RECOMBINANT TYPE II RESTRICTION ENDONUCLEASES,
MmeI AND RELATED ENDONUCLEASES AND METHODS FOR PRODUCING
THE SAME

BACKGROUND OF THE INVENTION 

The present invention relates to a DNA
(deoxyribonucleic acid) fragment, which fragment codes
for one polypeptide possessing two related enzymatic
functions, namely an enzyme which recognizes the DNA
sequence 5'-TCC(Pu)AC-3' and cleaves the phosphodiester
bond between the 20th and 21st residues 3' to this
recognition sequence on this DNA strand, and between the
18th and 19th residues 5' to the recognition sequence on
the complement strand 5'-GT(Py)GGA-3' to produce a 2
base 3' extension (hereinafter referred to as the MmeI
restriction endonuclease), and a second enzymatic
activity that recognizes the same DNA sequence,
5'-TCC(Pu)AC-3', but modifies this sequence by the
addition of a methyl group to prevent cleavage by the MmeI
endonuclease. The present invention also relates to a
vector containing the DNA fragment, a transformed host
containing this DNA fragment, and an improved process
for producing MmeI restriction endonuclease from such a
transformed host. The present invention also relates to
a process for identifying additional DNA fragments that
encode enzymes having the same general properties as
MmeI but potentially having unique DNA recognition
sequences. This process depends on the use of the amino
acid sequence of the MmeI enzyme presented in this
application, or subsequently on the additional sequences
identified through this process. The invention also
relates to additional DNA fragments, identifiable
through the process described, each of which encodes a
polypeptide having significant amino acid sequence
similarity to the MmeI polypeptide. The polypeptides
encoded by these DNA fragments are predicted to perform
similar functions to MmeI. Specifically, they are
predicted to possess the dual enzymatic functions of
cleaving DNA in a specific manner at a relatively far
distance from the specific recognition sequence and also
modifying their recognition sequences to protect the
host DNA from cleavage by endonuclease activity. An
example of such an enzyme identified by this process is
CstMI (see U.S. Application Serial No.: filed
concurrently herewith). CstMI was identified as a
potential endonuclease because of its highly significant
amino acid sequence similarity to MmeI. CstMI recognizes
the sequence 5'-AAGGAG-3' and cleaves the phosphodiester
bond between the 20th and 21st residues 3' to the
recognition sequence on this DNA strand, and between the
18th and 19th residues 5' to the recognition sequence on
the complement strand 5'-CTCCTT-3' to produce a 2 base
3' extension.
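
By way of illustration only, the relationship between each
recognition sequence and its complement strand can be checked
with a short Python sketch. The function name and the use of
the IUPAC codes R (purine) and Y (pyrimidine) in place of (Pu)
and (Py) are assumptions made for this example and are not
part of the disclosure.

```python
# Complement table using IUPAC codes: R (A or G) pairs with Y (C or T).
COMPLEMENT = {
    "A": "T", "T": "A", "G": "C", "C": "G",
    "R": "Y", "Y": "R", "N": "N",
}


def reverse_complement(site: str) -> str:
    """Return the complement strand of `site`, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(site.upper()))


if __name__ == "__main__":
    for name, site in (("MmeI", "TCCRAC"), ("CstMI", "AAGGAG")):
        print(f"{name}: 5'-{site}-3' / 5'-{reverse_complement(site)}-3'")
```

Written this way, the complement of 5'-TCC(Pu)AC-3' reads
5'-GT(Py)GGA-3' and that of 5'-AAGGAG-3' reads 5'-CTCCTT-3',
matching the strands named in the text.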

Restriction endonucleases are a class of enzymes 
that occur naturally in prokaryotes. There are several
classes of restriction systems known, of which the type 
II endonucleases are the class useful in genetic 
engineering. When these type II endonucleases are 
purified away from other contaminating prokaryotic
components, they can be used in the laboratory to break 
DNA molecules into precise fragments. This property 
enables DNA molecules to be uniquely identified and to 
be fractionated into their constituent genes. 
Restriction endonucleases have proved to be 
indispensable tools in modern genetic research. They 
are the biochemical 'scissors' by means of which genetic
engineering and analysis are performed.



WO 2004/007670 PCT/US2003/021570 

Restriction endonucleases act by recognizing and
binding to particular sequences of nucleotides (the
'recognition sequence') along the DNA molecule. Once
bound, the type II endonucleases cleave the molecule
within, or to one side of, the sequence. Different
restriction endonucleases have affinity for different
recognition sequences. The majority of restriction
endonucleases recognize sequences of 4 to 6 nucleotides
in length, although recently a small number of
restriction endonucleases which recognize 7 or 8
uniquely specified nucleotides have been isolated. Most
recognition sequences contain a dyad axis of symmetry
and in most cases all the nucleotides are uniquely
specified. However, some restriction endonucleases have
degenerate or relaxed specificities in that they
recognize multiple bases at one or more positions in
their recognition sequence, and some restriction
endonucleases recognize asymmetric sequences. HaeIII,
which recognizes the sequence 5'-GGCC-3', is an example
of a restriction endonuclease having a symmetrical,
non-degenerate recognition sequence; HaeII, which
recognizes 5'-(Pu)GCGC(Py)-3', typifies restriction
endonucleases having a degenerate or relaxed recognition
sequence; while BspMI, which recognizes 5'-ACCTGC-3',
typifies restriction endonucleases having an asymmetric
recognition sequence. Type II endonucleases with
symmetrical recognition sequences generally cleave
symmetrically within or adjacent to the recognition
site, while those that recognize asymmetric sequences
tend to cleave at a distance of from 1 to 20 nucleotides
to one side of the recognition site. The enzyme of this
application, MmeI (along with CstMI), has the
distinction of cleaving the DNA at the farthest distance
from the recognition sequence of any known type II
restriction endonuclease. More than two hundred unique
restriction endonucleases have been identified among
several thousands of bacterial species that have been
examined to date.
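
The distinction drawn above between symmetrical, degenerate
and asymmetric recognition sequences can be sketched as
follows. This is a minimal illustration, assuming the sites are
written with IUPAC codes (R for purine, Y for pyrimidine); the
function names are hypothetical.

```python
# IUPAC complement table; R (A or G) pairs with Y (C or T), W (A or T) with W.
COMPLEMENT = {
    "A": "T", "T": "A", "G": "C", "C": "G",
    "R": "Y", "Y": "R", "W": "W", "N": "N",
}


def reverse_complement(site: str) -> str:
    """Return the opposite strand of `site`, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(site.upper()))


def describe_site(site: str) -> dict:
    """Report whether a site has dyad symmetry and whether it is degenerate."""
    site = site.upper()
    return {
        # A site with a dyad axis of symmetry reads the same on both strands.
        "symmetric": site == reverse_complement(site),
        # A degenerate site allows more than one base at some position.
        "degenerate": any(base not in "ACGT" for base in site),
    }


if __name__ == "__main__":
    examples = {"HaeIII": "GGCC", "HaeII": "RGCGCY",
                "BspMI": "ACCTGC", "MmeI": "TCCRAC"}
    for enzyme, site in examples.items():
        print(enzyme, site, describe_site(site))
```

Under these definitions HaeIII is symmetric and non-degenerate,
HaeII is symmetric and degenerate, BspMI is asymmetric, and
MmeI is both asymmetric and degenerate.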

A second component of restriction systems is the
modification methylase. These enzymes are
complementary to restriction endonucleases and they
provide the means by which bacteria are able to protect
their own DNA and distinguish it from foreign, infecting
DNA. Modification methylases recognize and bind to the
same nucleotide recognition sequence as the
corresponding restriction endonuclease, but instead of
breaking the DNA, they chemically modify one or other of
the nucleotides within the sequence by the addition of a
methyl group. Following methylation, the recognition
sequence is no longer cleaved by the restriction
endonuclease. The DNA of a bacterial cell is modified
by virtue of the activity of its modification methylase
and it is therefore insensitive to the presence of the
endogenous restriction endonuclease. It is only
unmodified, and therefore identifiably foreign, DNA that
is sensitive to restriction endonuclease recognition and
cleavage. Modification methyltransferases are usually
separate enzymes from their cognate endonuclease
partners. In some cases, there is a single polypeptide
that possesses both a modification methyltransferase
function and an endonuclease function, for example,
Eco57I. In such cases, there is a second
methyltransferase present as part of the
restriction-modification system. In contrast, the MmeI
system of the present application has no second
methyltransferase accompanying the
endonuclease-methyltransferase polypeptide.

Endonucleases are named according to the bacteria
from which they are derived. Thus, the species
Haemophilus aegyptius, for example, synthesizes 3
different restriction endonucleases, named HaeI, HaeII
and HaeIII. These enzymes recognize and cleave the
sequences 5'-(W)GGCC(W)-3', 5'-(Pu)GCGC(Py)-3' and
5'-GGCC-3' respectively. Escherichia coli RY13, on the
other hand, synthesizes only one enzyme, EcoRI, which
recognizes the sequence 5'-GAATTC-3'.



While not wishing to be bound by theory, it is 
thought that in nature, restriction endonucleases play a 
protective role in the welfare of the bacterial cell. 
They enable bacteria to resist infection by foreign DNA 
molecules such as viruses and plasmids that would
otherwise destroy or parasitize them. They impart
resistance by binding to infecting DNA molecules and
cleaving them in each place that the recognition
sequence occurs. The disintegration that results
inactivates many of the infecting genes and renders the
DNA susceptible to further degradation by exonucleases.



More than 3000 restriction endonucleases have been
isolated from various bacterial strains. Of these, more
than 240 recognize unique sequences, while the rest
share common recognition specificities. Restriction
endonucleases which recognize the same nucleotide
sequence are termed "isoschizomers." Although the
recognition sequences of isoschizomers are the same,
they may vary with respect to site of cleavage (e.g.,
XmaI v. SmaI, Endow, et al., J. Mol. Biol. 112:521
(1977); Waalwijk, et al., Nucleic Acids Res. 5:3231
(1978)) and in cleavage rate at various sites (XhoI v.
PaeR7I, Gingeras, et al., Proc. Natl. Acad. Sci. U.S.A.
80:402 (1983)).

Restriction endonucleases have traditionally been
classified into three major classes: type I, type II and
type III. The type I restriction systems assemble a
multi-peptide complex consisting of restriction
polypeptide, modification polypeptide, and specificity,
or DNA recognition, polypeptide. Type I systems require
a divalent cation, ATP and S-adenosyl-methionine (SAM)
as cofactors. Type I systems cleave DNA at random
locations up to several thousand basepairs away from
their specific recognition site. The type III systems
generally recognize an asymmetric DNA sequence and
cleave at a specific position 20 to 30 basepairs to one
side of the recognition sequence. Such systems require
the cofactor ATP in addition to SAM and a divalent
cation. The type III systems assemble a complex of
endonuclease polypeptide and modification polypeptide
that either modifies the DNA at the recognition sequence
or cleaves. Type III systems produce partial digestion
of the DNA substrate due to this competition between
their modification and cleavage activities, and so have
not been useful for genetic manipulation.

MmeI does not require ATP for DNA cleavage activity
and it cleaves to completion; thus it can be classified
as a type II endonuclease. Unlike other type II enzymes,
however, MmeI consists of a single polypeptide that
combines both endonuclease and modification activities
and is sufficient by itself to form the entire
restriction modification system. MmeI also cleaves the
farthest distance from the specific DNA recognition
sequence of any type II endonuclease (as does CstMI of
this application). MmeI is quite large and appears to
have three functional domains combined in one
polypeptide. These consist of an amino-terminal domain
which contains the endonuclease DNA cleavage motif and
which may also be involved in DNA recognition, a DNA
modification domain most similar to the gamma-class N6mA
methyltransferases, and a carboxy-terminal domain
presumed to be involved in dimer formation and possibly
DNA recognition. The enzyme requires SAM for both
cleavage and modification activity. The single MmeI
polypeptide is sufficient to modify the plasmid vector
carrying the gene in vivo to provide protection against
MmeI cleavage in vitro, yet it is also able to cleave
unmodified DNAs in vitro when using the endonuclease
buffer containing Mg++ and SAM.

There is a continuing need for novel type II
restriction endonucleases. Although type II restriction
endonucleases which recognize a number of specific
nucleotide sequences are currently available, new
restriction endonucleases which recognize novel
sequences provide greater opportunities and ability for
genetic manipulation. Each new unique endonuclease
enables scientists to precisely cleave DNA at new
positions within the DNA molecule, with all the
opportunities this offers.



SUMMARY OF THE INVENTION 

In accordance with the present invention, there is
provided a novel DNA fragment encoding a novel
restriction endonuclease, obtainable from Methylophilus
methylotrophus (NEB#1190). The endonuclease is
hereinafter referred to as "MmeI", which endonuclease:

(1) recognizes the degenerate nucleotide sequence
5'-TCC(Pu)AC-3' in a double-stranded DNA
molecule as shown below:

5'-TCC(Pu)AC-3'
3'-AGG(Py)TG-5'

(wherein G represents guanine, C represents
cytosine, A represents adenine, T represents
thymine, (Pu) represents a purine, either A or
G, and (Py) represents a pyrimidine, either C
or T);

(2) cleaves DNA in the phosphodiester bond
following the 20th nucleotide 3' to the
recognition sequence 5'-TCC(Pu)AC-3' and
preceding the 18th nucleotide 5' to the
complement strand of the recognition sequence
5'-GT(Py)GGA-3' to produce a 2 base 3'
extension:

5'-TCC(Pu)AC(N20)/-3'
3'-AGG(Py)TG(N18)/-5'; and

(3) methylates the recognition sequence specified
in (1) in vivo to protect the host DNA from
cleavage by the MmeI endonuclease activity.
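
Properties (1) and (2) above can be illustrated by a short
Python sketch that locates MmeI sites in a DNA sequence and
reports where each strand would be cut. The coordinate
convention (a cut position is the number of top-strand bases
lying to its left), the function name and the example sequence
are assumptions made for this illustration only.

```python
import re

# Lookahead patterns so that overlapping sites are all reported.
SITE_ON_TOP = re.compile(r"(?=TCC[AG]AC)")     # 5'-TCC(Pu)AC-3' on the top strand
SITE_ON_BOTTOM = re.compile(r"(?=GT[CT]GGA)")  # site lies on the bottom strand
SITE_LENGTH = 6


def mmei_cuts(top_strand: str) -> list:
    """Return (site_start, strand, top_cut, bottom_cut) for each MmeI site.

    Sites whose cut positions fall outside the sequence are skipped.
    """
    seq = top_strand.upper()
    cuts = []
    for match in SITE_ON_TOP.finditer(seq):
        start = match.start()
        top_cut = start + SITE_LENGTH + 20     # after the 20th base 3' of the site
        bottom_cut = start + SITE_LENGTH + 18  # 18 bases along on the complement
        if top_cut <= len(seq):
            cuts.append((start, "+", top_cut, bottom_cut))
    for match in SITE_ON_BOTTOM.finditer(seq):
        start = match.start()
        bottom_cut = start - 20  # recognition strand is the bottom strand here
        top_cut = start - 18
        if bottom_cut >= 0:
            cuts.append((start, "-", top_cut, bottom_cut))
    return cuts


if __name__ == "__main__":
    example = "TCCGAC" + "ACGTACGTACGTACGTACGTACGTACGT"  # hypothetical substrate
    for start, strand, top_cut, bottom_cut in mmei_cuts(example):
        overhang = example[min(top_cut, bottom_cut):max(top_cut, bottom_cut)]
        print(start, strand, top_cut, bottom_cut, overhang)
```

In either orientation the two cut positions differ by two bases,
which corresponds to the 2 base 3' extension described in (2).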

The invention further relates to additional DNA
fragments, each of which is identified to encode
polypeptides which share significant sequence similarity
to the MmeI restriction-modification polypeptide. The
DNA fragment encoding the MmeI polypeptide enables the
identification of these additional potential
endonucleases by using similarity searching of the MmeI
sequence against sequences available in databases, such
as GENBANK, using a program such as BLAST (Altschul, et
al. Nucleic Acids Res. 25:3389-3402 (1997)). These DNA
fragments, as well as any other fragments with such
similarity to MmeI that may be deposited in the
databases in the future, are candidates which may encode
polypeptides that are similar to MmeI, in that the
polypeptides encoded act as both restriction
endonuclease and methyltransferase. These polypeptides
may, like MmeI, cleave DNA at a similarly far distance
from the recognition sequence, in the range of 18 to 20
nucleotides or more, which character is unique and
useful in certain molecular biology technologies.
Specifically these polypeptides contain amino acid
motifs common to N6mA DNA methyltransferases in the
middle of the polypeptide, have a motif common to
restriction endonucleases and located in the
amino-terminal section of the polypeptides, consisting of
the amino acids D/E(X8-X12)D/EXK, and have a region of
several hundred amino acids following the conserved
methyltransferase motifs which are significantly similar
to this region of MmeI and are believed to serve as a
dimerization and possibly a DNA sequence recognition
domain. An example of such a polypeptide, CstMI, is
presented. CstMI has been shown to recognize the 6 base
pair asymmetric sequence 5'-AAGGAG-3' and to cleave the
DNA in the same manner as MmeI: 5'-AAGGAGN20/N18-3'. The
endonucleases encoded by these DNA fragments may be
produced by the process used for MmeI, as described
below.
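
The endonuclease motif D/E(X8-X12)D/EXK referred to above can
be searched for in a candidate amino acid sequence with a
regular expression. The following Python sketch is illustrative
only; the function name and the toy sequence are hypothetical,
and a match to this pattern alone does not establish
endonuclease function.

```python
import re

# (D or E), then 8 to 12 residues of any kind, then (D or E), any residue, K.
# The lookahead wrapper lets overlapping occurrences be reported separately.
ENDONUCLEASE_MOTIF = re.compile(r"(?=([DE][A-Z]{8,12}[DE][A-Z]K))")


def find_endonuclease_motifs(protein: str) -> list:
    """Return (start, matched_text) for each motif occurrence in `protein`."""
    return [(m.start(), m.group(1))
            for m in ENDONUCLEASE_MOTIF.finditer(protein.upper())]


if __name__ == "__main__":
    toy_sequence = "MA" + "D" + "A" * 9 + "EGK" + "LLLL"  # hypothetical peptide
    print(find_endonuclease_motifs(toy_sequence))
```

In practice the hits would be restricted to the amino-terminal
portion of the candidate, as the text specifies; that filter is
left out here for brevity.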



The present invention further relates to a process
for the production of the restriction endonuclease MmeI.
This process comprises culturing a transformed host,
such as E. coli, containing the DNA fragment encoding
the MmeI restriction system polypeptide, collecting the
cultured cells, obtaining a cell-free extract therefrom
and separating and collecting the restriction
endonuclease MmeI from the cell-free extract. The
present invention further relates to a process for the
production of the restriction endonucleases encoded by
the DNA sequences identified as homologous to MmeI. This
process comprises culturing a transformed host, such as
E. coli, containing the gene for these restriction
systems, collecting the cultured cells, obtaining a
cell-free extract therefrom and separating and
collecting the restriction endonuclease from the
cell-free extract.

BRIEF DESCRIPTION OF THE FIGURES 

Figure 1 - Agarose gel showing MmeI cleavage of
lambda, T7, phiX174, pBR322 and pUC19 DNAs.

Figure 2 - DNA sequence of the MmeI gene locus (SEQ
ID NO:1).

Figure 3 - Amino acid sequence of the MmeI gene
locus (SEQ ID NO:2).

Figure 4 - Agarose gel showing MmeI cleavage of
pTBMmeI.1 DNA and unmodified DNA substrates.

Figure 5 - Agarose gel showing MmeI cleavage of
unmethylated, hemi-methylated and fully methylated DNA
substrates.

Figure 6 - Incorporation of labeled methyl groups
into unmethylated, hemi-methylated and fully methylated
DNA substrates.

Figure 7 - Multiple sequence alignment of MmeI
amino acid sequence (SEQ ID NO:3 through SEQ ID NO:14)
and homologous polypeptides from public databases.

DETAILED DESCRIPTION OF THE INVENTION

The recognition sequence and cleavage site of the
endonuclease of the present invention were previously
described (Boyd, Nucleic Acids Res. 14:5255-5274
(1986)). However, the MmeI enzyme proved difficult to
produce from the native host, Methylophilus
methylotrophus, due to very low yield of the enzyme and
the relative difficulty of growing the M. methylotrophus
host in large quantity. To overcome these limitations to
producing MmeI, the present application describes the
identification of the DNA sequence encoding the MmeI
gene and the expression of this MmeI gene in a suitable
host, in the present instance E. coli. This manipulation
of the MmeI encoding DNA fragment results in both a
significant increase in the amount of enzyme produced
per gram of cells and a significant increase in ease of
growth of large amounts of cells containing MmeI enzyme.

Several standard approaches typically employed by
persons skilled in the art of cloning were applied to
the task of cloning MmeI without success.
Specifically, the methylase selection approach (Wilson,
et al., U.S. Patent No. 5,200,333) was attempted
unsuccessfully. Several random libraries of M.
methylotrophus DNA were constructed in E. coli and
challenged by digesting with MmeI, but no MmeI methylase
containing clones were obtained.

A second approach was also attempted but failed. In
this approach, antibodies specific for N6mA were used to
screen a library of random clones constructed in a
lambda phage replacement vector. The approach was
successful in obtaining methylase positive clones, but
all examined were found to express the methyltransferase
of the second restriction system in M. methylotrophus,
the MmeII methylase (recognition sequence 5'-GATC-3'),
rather than the desired MmeI methylase activity.

The successful approach to obtain the desired DNA
fragment encoding the MmeI restriction system involved
several steps. First a novel purification procedure was
developed to purify the MmeI endonuclease peptide to
homogeneity from M. methylotrophus. Once this ultra pure
MmeI endonuclease polypeptide was successfully obtained
in a significant amount, amino acid sequence from the
amino terminus and from internal cyanogen-bromide
degradation peptides was determined. Using the amino
acid sequence obtained, degenerate DNA primers
complementary to the DNA coding for the amino acid
sequences were synthesized and used to PCR amplify a
portion of the MmeI gene. The DNA sequence of this
portion of the MmeI gene was determined. The entire MmeI
endonuclease gene and surrounding DNA sequences were
then obtained by applying the inverse PCR technique. A
number of primers matching the DNA sequence obtained
were designed, synthesized and used in combination with
numerous different templates. The inverse PCR templates
were produced by digesting M. methylotrophus genomic DNA
with various restriction endonucleases and then ligating
the cut M. methylotrophus DNA at low concentration to
obtain circular molecules. The various primers were
tried in combinations with the various templates to find
primer-template combinations that produced a specific
PCR amplification product. The products thus obtained
were sequenced. Once the DNA sequence encoding the
entire MmeI endonuclease gene was obtained, primers were
designed to specifically amplify the gene from M.
methylotrophus genomic DNA. The amplified gene was
inserted into an expression vector and cloned into an E.
coli host. The host was tested and found to both express
MmeI endonuclease activity and to in vivo modify the
recombinant expression vector such that it was protected
against MmeI endonuclease activity in vitro.
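
The design of degenerate primers from a determined amino acid
sequence, as used in the approach above, can be sketched in
Python as follows. This is a simplified illustration and not
the primer design actually used: the peptide is hypothetical,
the codon table assumes the standard genetic code, and the
codes chosen for Leu, Arg and Ser (YTN, MGN, WSN) are
deliberately over-inclusive single-codon approximations.

```python
from math import prod

# One IUPAC-degenerate codon per amino acid (standard genetic code assumed).
DEGENERATE_CODON = {
    "A": "GCN", "C": "TGY", "D": "GAY", "E": "GAR", "F": "TTY",
    "G": "GGN", "H": "CAY", "I": "ATH", "K": "AAR", "L": "YTN",
    "M": "ATG", "N": "AAY", "P": "CCN", "Q": "CAR", "R": "MGN",
    "S": "WSN", "T": "ACN", "V": "GTN", "W": "TGG", "Y": "TAY",
}

# Number of bases each IUPAC nucleotide code stands for.
CODE_SIZE = {"A": 1, "C": 1, "G": 1, "T": 1, "R": 2, "Y": 2, "M": 2,
             "K": 2, "S": 2, "W": 2, "H": 3, "B": 3, "V": 3, "D": 3, "N": 4}

# IUPAC complements, needed for the reverse (antisense) primer.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "R": "Y", "Y": "R",
              "M": "K", "K": "M", "S": "S", "W": "W", "H": "D", "D": "H",
              "B": "V", "V": "B", "N": "N"}


def back_translate(peptide: str) -> str:
    """Return a degenerate forward primer encoding `peptide`."""
    return "".join(DEGENERATE_CODON[residue] for residue in peptide.upper())


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    return prod(CODE_SIZE[base] for base in primer)


def reverse_complement(primer: str) -> str:
    """Return the degenerate primer for the opposite strand."""
    return "".join(COMPLEMENT[base] for base in reversed(primer))


if __name__ == "__main__":
    peptide = "MKDEW"  # hypothetical peptide, not taken from the MmeI sequence
    forward = back_translate(peptide)
    print(forward, degeneracy(forward), reverse_complement(forward))
```

Stretches rich in Met, Trp and two-codon residues give the
lowest degeneracy, which is one common reason for preferring
them when choosing the peptide region to prime from.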

This finding that the single polypeptide encoding
the MmeI endonuclease also provided in vivo protection
against MmeI is in contrast to the previously published
information on MmeI (Tucholski, Gene 223:293-302
(1998)). Specifically, this reference taught that the
MmeI endonuclease polypeptide did not provide protection
against MmeI endonuclease cleavage. This reference
reported a separate methyltransferase of 48kD as
required to modify the MmeI site on both strands and
thus block cleavage by the MmeI endonuclease.
Specifically, the reference teaches that the MmeI
endonuclease polypeptide modifies the adenine in the top
strand of the recognition sequence only, 5'-TCCRAC-3',
and that such modified DNA is cut by the MmeI
endonuclease. The DNA fragment of the present invention
encodes the MmeI endonuclease gene, which when grown
alone in an E. coli host renders the vector containing
the MmeI endonuclease resistant to cleavage by the
purified MmeI endonuclease. Further, the MmeI
endonuclease produced from this fragment does not cleave
a DNA fragment modified at the adenine of the top
strand, 5'-TCCRAC-3', when no modification of the
opposite, or bottom, strand is present. This is in
contrast to the teaching of the Tucholski reference.
Also, the MmeI endonuclease of this application does
cleave a DNA fragment in which the adenine residue in
the bottom strand, 5'-GTYGGA-3', is modified, in contrast
to the teaching of the Tucholski reference. When both
the top strand and the bottom strand are modified at the
adenine residues, the MmeI endonuclease does not cleave
the DNA. No second methyltransferase gene, such as
reported in the Tucholski reference, was found adjacent
to the MmeI endonuclease gene. There is an open reading
frame immediately 3' to the MmeI endonuclease gene which
would encode a protein of approximately the reported
size of such a second methyltransferase activity (48kD).
However, this potential polypeptide does not have the
amino acid motifs found in methyltransferases, nor did
it provide protection against MmeI endonuclease when
cloned in E. coli. While the Tucholski reference taught
the necessity of a second methyltransferase polypeptide
to provide protection against MmeI endonuclease activity
for the host cell, it is demonstrated in the present
application that the DNA fragment encoding the MmeI
endonuclease polypeptide is sufficient to provide such
protection. Additionally, the eleven DNA fragments
described herein which encode amino acid sequences
similar to MmeI are not flanked by any recognizable DNA
methyltransferase genes. This indicates that these
polypeptides are also likely to provide both protection
for the host DNA and endonuclease activity against
unmodified DNA substrates on their own, without having a
second methyltransferase as part of the restriction
modification system. This contrasts with other type II
restriction modification systems.
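
The methylation sensitivity reported above for the MmeI
endonuclease of this application can be summarized as a small
lookup table. The Python sketch below simply restates the four
cases described in the text; the names are hypothetical and the
table carries no information beyond those statements.

```python
# Keys are (top_strand_methylated, bottom_strand_methylated) at the adenines
# of 5'-TCCRAC-3' (top) and 5'-GTYGGA-3' (bottom); values say whether the
# MmeI endonuclease of this application cleaves the substrate.
MMEI_CLEAVAGE = {
    (False, False): True,   # unmodified DNA is cleaved
    (True, False): False,   # top strand only modified: not cleaved
    (False, True): True,    # bottom strand only modified: cleaved
    (True, True): False,    # both strands modified: not cleaved
}


def mmei_cleaves(top_methylated: bool, bottom_methylated: bool) -> bool:
    """Look up whether a substrate in the given methylation state is cleaved."""
    return MMEI_CLEAVAGE[(top_methylated, bottom_methylated)]


if __name__ == "__main__":
    for state, cleaved in MMEI_CLEAVAGE.items():
        print(state, "cleaved" if cleaved else "protected")
```

Read this way, the four reported cases are consistent with
protection tracking the methylation state of the top-strand
adenine.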

The same group (Tucholski, Gene 223:293-302
(1998), and Anna Podhajska, personal communication) had
previously reported an amino acid sequence of eight
residues for a single internal CnBr digestion fragment
(sequence GRGRGVGV (SEQ ID NO: )). PCR based on this
sequence was attempted yet failed repeatedly. This
sequence was found to be unrelated to MmeI once the
actual MmeI amino acid sequence was determined in
accordance with the present invention. Therefore correct
internal amino acid sequence determination, which
enabled the cloning of the MmeI gene, depended on the
novel purification method described in this application
for the production of sufficiently pure MmeI in large
enough quantity to determine cyanogen bromide internal
fragment amino acid sequences, as performed in this
Application.

In Example II we obtained MmeI by culturing a
transformed host carrying the MmeI gene, such as E. coli
ER2683 carrying pTBMmeI.1, and recovering the
endonuclease from the cells. A sample of E. coli ER2683
carrying pTBMmeI.1 (NEB#1457) has been deposited under
the terms and conditions of the Budapest Treaty with the
American Type Culture Collection (ATCC) on July 3, 2002
and bears the Patent Accession No. PTA-4521.

For recovering the enzyme of the present invention,
E. coli carrying pTBMmeI.1 (NEB#1457) may be grown using
any suitable technique. For example, E. coli carrying
pTBMmeI.1 may be grown in Luria broth media containing
100 µg/ml ampicillin and incubated at 37°C
with aeration. Cells in the late logarithmic stage of
growth are induced by adding 0.3 mM IPTG, grown for an
additional 4 hours, collected by centrifugation and
either disrupted immediately or stored frozen at -70°C.

The MmeI enzyme can be isolated from E. coli
cells carrying pTBMmeI.1 by conventional protein
purification techniques. For example, cell paste is
suspended in a buffer solution and treated by
sonication, high pressure dispersion or enzymatic
digestion to allow extraction of the endonuclease by the
buffer solution. Intact cells and cellular debris are
then removed by centrifugation to produce a cell-free
extract containing MmeI. The MmeI endonuclease, along
with its corresponding intrinsic methylase activity, is
then purified from the cell-free extract by ion-exchange
chromatography, affinity chromatography, molecular sieve
chromatography, or a combination of these methods to
produce the endonuclease of the present invention.

The present invention also relates to methods for
identifying additional DNA fragments, each of which
encodes a polypeptide having significant amino acid
sequence similarity to the MmeI polypeptide. The
polypeptides encoded by these DNA fragments are
predicted to perform similar functions to MmeI.
Specifically, they are predicted to possess the dual
enzymatic functions of cleaving DNA in a specific manner
at a relatively far distance from the specific
recognition sequence and also modifying their
recognition sequences to protect the host DNA from
cleavage by their endonuclease activity. Once the amino
acid sequence of the MmeI endonuclease was determined as
described in this application, sequences deposited in
databases can be compared to this MmeI sequence to find
those few sequences that are highly significantly
similar to MmeI. This method is similar to that of U.S.
Patent No. 6,383,770 (Roberts, et al.), except that here
we are searching for similarity to the MmeI endonuclease
sequence, rather than searching for sequences that match
a database of methyltransferase or endonuclease proteins
and then examining any unidentified open reading frames
next to potential methyltransferase open reading frames.
Prior to identifying the MmeI amino acid sequence, the
DNA sequences coding for proteins related to MmeI had
not been included in the database of restriction and
methyltransferase gene sequences utilized by Roberts, et
al., supra, since these sequences had not been linked to
any known endonuclease function. The method disclosed
herein of identifying potential MmeI-like endonucleases
is thus more specific than the method of U.S. Patent No.
6,383,770 (Roberts, et al.).

Similarity searching of the MmeI sequence against
sequences available in databases, such as GENBANK, is
accomplished using a program such as BLAST (Altschul, et
al. Nucleic Acids Res. 25:3389-3402 (1997)). A sequence
with an expectation value (E) score of less than E=e^-10
is considered a potential candidate endonuclease.
Sequences that give expectation values that are much
lower, such as less than E=e^-30, are to be considered
highly likely to be endonucleases like MmeI. Such
candidate MmeI-like peptides are further examined to see
if they conform to the domain architecture that MmeI
exhibits. A true candidate will contain an endonuclease
fold motif, usually of the form (D/E)X8-X12(D/E)XK, in
the amino-terminal portion of the peptide (Aravind et
al. Nucleic Acids Res. 28:3417-3432 (2000)). A true
candidate will contain methyltransferase motifs in the
middle portion of the peptide similar to gamma class
N6-methyladenine methyltransferases, and sequences similar
to the carboxyl portion of MmeI in the carboxyl portion
of the candidate peptide. Such a BLAST search performed
on June 12, 2003 returned the following sequences as
highly significantly similar to MmeI:
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10 



Most of these proteins are labeled as hypothetical
or putative in their database entries. A number of these
appear to be full-length polypeptides, such as sequence
#2 above: GcrY. Such candidates can be expressed as
described in Roberts to identify the expected
endonuclease activity. Some endonuclease genes may be
inactive in the particular strain used for sequencing
(Lin, et al. Proc. Natl. Acad. Sci. USA 98:2740-2745
(2001)). In such a circumstance it may prove possible to
express functional endonucleases by repairing the
mutations that have inactivated these genes. Several of
the MmeI homologs, such as #7 (SEQ ID NO:14) (Deinococcus
radiodurans DR2267) and #8 (SEQ ID NO:13) (Deinococcus
radiodurans DR0119.1), have disruptions in the open
reading frames. DR2267 has a stop codon, TAG, which
prematurely terminates the open reading frame, in a
position where MmeI has a glutamate amino acid coded for
by the codon GAG. By changing this TAG stop codon to GAG
it may be possible to reactivate this potential
endonuclease gene. DR0119.1 is also disrupted, in that
it has a frameshift that disrupts the open reading frame.
The MmeI sequence may be used as a guide to direct where
to repair this frameshift by maximizing the similarity
of the DR0119.1 sequence to the MmeI sequence. This may
well restore DR0119.1 endonuclease activity.
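
The repair of a premature stop codon described above for DR2267
can be illustrated in Python. The sketch below is a simplified
assumption: it replaces the first in-frame internal TAG with
GAG, whereas in practice the position would be chosen from the
alignment with MmeI. The function name and the short example
sequence are hypothetical and are not the DR2267 sequence.

```python
def repair_premature_stop(orf: str, stop: str = "TAG", replacement: str = "GAG"):
    """Replace the first in-frame internal `stop` codon with `replacement`.

    Returns (repaired_sequence, codon_index), or (orf, None) if no internal
    in-frame stop codon of that kind is present. The final codon is left
    untouched because it is taken to be the legitimate terminator.
    """
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3
    codons = [orf[i:i + 3] for i in range(0, usable, 3)]
    for index, codon in enumerate(codons[:-1]):
        if codon == stop:
            codons[index] = replacement
            return "".join(codons) + orf[usable:], index
    return orf, None


if __name__ == "__main__":
    disrupted = "ATGAAATAGCCCTAA"  # hypothetical ORF with an internal TAG
    repaired, position = repair_premature_stop(disrupted)
    print(repaired, position)
```

A frameshift such as that in DR0119.1 cannot be handled this
way, since it requires inserting or deleting bases at a
position chosen by alignment.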

An alternative way to generate potential new
endonucleases is to take advantage of their similar
domain structure by performing domain swapping. One may
be able to swap the amino terminal domain of an MmeI-like
peptide for the amino terminal domain of the MmeI
protein, for example by swapping the sequence of the
potential new gene up to the first methyltransferase
motif (motif X, "Gly Ala His Tyr Thr Ser") into MmeI to
replace the portion of MmeI up to the same sequence.
This approach may be particularly useful when only a
partial sequence is available or a potential gene has
lost function due to multiple mutations. This approach
will create a chimeric protein that potentially has
endonuclease activity and cleaves at a distance away
from the recognition sequence, like MmeI, but that
recognizes a novel DNA sequence. One may also find
sequences in the databases that are highly similar to
MmeI but that are partial. For example, sequence #11
(SEQ ID NO: 9) above (Pseudomonas fluorescens) is from a
small fragment of DNA sequence in the database. To
obtain a functional endonuclease like MmeI from this
sequence one can use inverse PCR or other techniques to
obtain DNA sequence adjacent to the fragment reported,
then use that sequence to obtain an intact endonuclease
gene.
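
The domain swap described above can be sketched at the sequence level as follows. This is only an illustration under stated assumptions: the function name and the toy sequences are hypothetical, and the motif is written in one-letter code (GAHYTS for Gly Ala His Tyr Thr Ser). It joins the homolog's sequence up to the motif to the MmeI sequence from the motif onward; it says nothing about whether the resulting chimera would be active.

```python
# Illustrative sketch only: names and toy sequences are hypothetical.
MOTIF_X = "GAHYTS"  # Gly Ala His Tyr Thr Ser in one-letter code


def swap_amino_terminal_domain(mmei_protein: str, homolog_protein: str,
                               motif: str = MOTIF_X) -> str:
    """Return homolog sequence up to the motif joined to MmeI from the motif on."""
    i = mmei_protein.find(motif)
    j = homolog_protein.find(motif)
    if i < 0 or j < 0:
        raise ValueError("motif must be present in both sequences")
    # Amino terminal portion from the homolog, remainder from MmeI.
    return homolog_protein[:j] + mmei_protein[i:]


if __name__ == "__main__":
    toy_mmei = "MALSWNEIRR" + MOTIF_X + "EANILKLI"   # toy sequence, not real MmeI
    toy_homolog = "MKTTPQ" + MOTIF_X + "DDDD"        # toy homolog
    print(swap_amino_terminal_domain(toy_mmei, toy_homolog))
```

In practice the junction would be made at the DNA level; the string operation only shows where the two sequences are joined.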

Once a sequence is identified the potential
endonuclease can be expressed and characterized as
described in Roberts, et al. supra. Here, however, there
is no separate methyltransferase gene to express along
with the endonuclease. Once such a potential
endonuclease is cloned and expressed in a suitable host,
such as E. coli, a cell-free extract is prepared and
analyzed to detect any endonuclease activity. Such an
endonuclease assay must include the SAM cofactor
required by these endonucleases. Once specific DNA
cleavage activity is found the recognition sequence and
cleavage site may be determined by standard methods
(Schildkraut, (1984) In Genet. Eng. (N Y) Vol 6. (Setlow
J.K., Hollaender, A. Ed.), pp 117-140. Plenum Press, New
York. "Screening for and characterizing restriction
endonucleases.").

The enzymes so identified can be isolated from E.
coli cells carrying the DNA fragment in a suitable
vector by conventional protein purification techniques.
For example, cell paste is suspended in a buffer
solution and treated by sonication, high pressure
dispersion or enzymatic digestion to allow extraction of
the endonuclease by the buffer solution. Intact cells
and cellular debris are then removed by centrifugation
to produce a cell-free extract containing the enzyme.
The endonuclease, along with its corresponding intrinsic
methylase activity, is then purified from the cell-free
extract by ion-exchange chromatography, affinity
chromatography, molecular sieve chromatography, or a
combination of these methods to produce the endonuclease
of the present invention.

These DNA fragments, as well as any other fragments
with such similarity to MmeI that may be deposited in
the databases in the future, are predicted to encode
polypeptides that are similar to MmeI, in that the
polypeptides encoded act as both restriction
endonuclease and methyltransferase. These polypeptides
may, like MmeI, cleave DNA at a similarly far distance
from the recognition sequence, in the range of about 18
to 20 nucleotides or more, a characteristic that is unique
and useful in certain molecular biology technologies.

An example of such an enzyme identified by this
process is CstMI (see U.S. Application Serial
No. , filed concurrently herewith). CstMI was
identified as a potential endonuclease because of its
highly significant amino acid sequence similarity to
MmeI. CstMI is encoded by sequence #2 above (SEQ ID
NO: 8), which gave a highly significant Expectation value
of e ~ 1?1 when compared to MmeI by BLAST. CstMI recognizes
the 6 base pair asymmetric sequence 5'-AAGGAG-3' and
cleaves the DNA in the same manner as MmeI: it cleaves
the phosphodiester bond between the 20th and 21st
residues 3' to this recognition sequence on this DNA
strand, and between the 18th and 19th residues 5' to the
recognition sequence on the complement strand 5'-CTCCTT-3'
to produce a 2 base 3' extension.
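
The cleavage geometry stated above (20 nucleotides 3' of the site on the recognition strand, 18 nucleotides on the complement strand) can be illustrated with a short sketch. The function names and coordinate convention are assumptions made for illustration: positions are 0-based top-strand coordinates, only sites written on the given strand are searched, and overlapping sites are not reported.

```python
import re

# IUPAC codes needed for the two sites discussed in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}


def site_regex(site: str) -> str:
    """Convert a recognition sequence with IUPAC codes to a regular expression."""
    return "".join(IUPAC[base] for base in site)


def distal_cuts(seq: str, site: str, top_offset: int = 20, bottom_offset: int = 18):
    """List (site_start, top_cut, bottom_cut) for sites found on the given strand.

    top_cut is the number of top-strand bases 5' of the top-strand nick;
    bottom_cut is the bottom-strand nick expressed in the same coordinates.
    """
    cuts = []
    for match in re.finditer(site_regex(site), seq):
        top = match.end() + top_offset
        bottom = match.end() + bottom_offset
        if top <= len(seq):  # skip sites too close to the end of the molecule
            cuts.append((match.start(), top, bottom))
    return cuts


if __name__ == "__main__":
    toy = "GG" + "AAGGAG" + "C" * 25   # toy substrate with one CstMI site
    print(distal_cuts(toy, "AAGGAG"))
```

The difference top_cut - bottom_cut is 2, which corresponds to the 2 base 3' extension described in the text.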

The present invention is further illustrated by the
following Examples. These Examples are provided to aid
in the understanding of the invention and are not to be
construed as a limitation thereof.

The references cited above and below are herein
incorporated by reference.



EXAMPLE I

PURIFICATION OF MmeI ENDONUCLEASE

A single colony of Methylophilus methylotrophus
(NEB#1190) was grown for 24 hours in 1 liter of medium M
(0.08 µM CuSO4, 0.448 µM MnSO4, 0.348 µM ZnSO4, 6.0 µM
FeCl3, 18 µM CaCO3, 1.6 mM MgSO4, 9.0 mM NaH2PO4, 10.9 mM
K2HPO4, 13.6 mM (NH4)2SO4). This culture was
used to inoculate 100 liters of medium M. The cells
were grown aerobically at 37°C, overnight, until
stationary. Five 100-liter fermentations were required
to harvest 752 grams of wet cell pellet.

750 grams of M. methylotrophus cell pellet was
suspended in 2.25 liters of Buffer A (20 mM Tris-HCl (pH
8.0), 50 mM NaCl, 1.0 mM DTT, 0.1 mM EDTA, 5% Glycerol)
and passed through a Gaulin homogenizer at ~12,000 psig.
The lysate was centrifuged at ~13,000 x g for 40 minutes
and the supernatant collected.

The supernatant solution was applied to a 500 ml
Heparin Hyper-D column (BioSepra SA) which had been
equilibrated in buffer A. A 1.0 L wash of buffer A was
applied, then a 2 L gradient of NaCl from 0.05 M to 1 M
in buffer A was applied and fractions were collected.
Fractions were assayed for MmeI endonuclease activity
by incubating with 1 µg Lambda DNA (NEB) in 50 µl
NEBuffer 1, supplemented with 32 µM
S-adenosyl-L-methionine (SAM) for 15 minutes at 37°C.
MmeI activity eluted at 0.3 M to 0.4 M NaCl.

The Heparin Hyper-D column fractions containing the
MmeI activity were pooled, diluted to 50 mM NaCl with
buffer A (without NaCl) and applied to a 105 ml Source15
Q column (Amersham Biotech) which had been equilibrated
with buffer A. A 210 ml wash with buffer A was applied
followed by a 1.0 L gradient of NaCl from 0.05 M to 0.7
M in buffer A. Fractions were collected and assayed
for MmeI endonuclease activity. The MmeI activity was
found in the unbound fraction.
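
The dilution of pooled fractions to 50 mM NaCl before loading the next column follows from a simple mass balance. The sketch below is illustrative: the function name is hypothetical, and the 100 ml pool volume and 350 mM pool concentration in the usage line are assumed values (the text gives only the 0.3 M to 0.4 M elution range, not the pool volume).

```python
def diluent_volume_ml(pool_volume_ml: float, pool_salt_mM: float,
                      target_salt_mM: float, diluent_salt_mM: float = 0.0) -> float:
    """Volume of diluent needed to bring a pool down to the target salt concentration.

    Mass balance: V_pool*C_pool + V_dil*C_dil = (V_pool + V_dil)*C_target.
    """
    if not (diluent_salt_mM < target_salt_mM <= pool_salt_mM):
        raise ValueError("need diluent < target <= pool concentration")
    return pool_volume_ml * (pool_salt_mM - target_salt_mM) / (target_salt_mM - diluent_salt_mM)


if __name__ == "__main__":
    # Assumed example: 100 ml pool at about 350 mM NaCl, diluted with
    # buffer A lacking NaCl to reach 50 mM NaCl.
    print(diluent_volume_ml(100.0, 350.0, 50.0))
```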

The Source15 Q pool was loaded onto a 22 ml
AF-Heparin-TSK column (TosoHaas) which had been
equilibrated with buffer A. A wash of 44 ml buffer A
was applied, followed by a linear gradient of NaCl from
0.05 M to 1.0 M in buffer A. Fractions were collected
and assayed for MmeI endonuclease activity. The MmeI
activity eluted between 0.26 M and 0.29 M NaCl. The
fractions containing activity were pooled and dialyzed
against buffer B (20 mM NaPO4 (pH 7.0), 50 mM NaCl, 1.0
mM DTT, 0.1 mM EDTA, 5% Glycerol).

The dialyzed AF-Heparin-TSK pool was loaded onto a
6 ml Resource15 S column (Amersham Biotech) which had
been equilibrated with buffer B. A wash of 12 ml buffer
B was applied, followed by a linear gradient of NaCl
from 0.05 M to 1.0 M in buffer B. Fractions were
collected and assayed for MmeI endonuclease activity.
MmeI activity eluted between 0.14 M and 0.17 M NaCl.

This pool was applied to a 2 liter Superdex 75
sizing column (Amersham Biotech) which had been
equilibrated with buffer C (20 mM Tris-HCl, pH 8.0, 500
mM NaCl, 1.0 mM DTT, 0.1 mM EDTA, 5% Glycerol).
Fractions were collected between 500 and 1500 ml
elution with buffer C, then assayed by MmeI endonuclease
assay and polyacrylamide gel electrophoresis on a 4-20%
gradient gel, followed by protein staining with
Coomassie Brilliant Blue dye. Fractions eluting between
775 and 825 ml corresponded to MmeI activity and a
protein band of 105 kDa. These fractions were pooled and
dialyzed against buffer D (20 mM NaPO4 (pH 7.0), 50 mM
NaCl, 1 mM DTT, 5% Glycerol).

The dialyzed sizing pool was applied to a 16 ml
Ceramic HTP column (BioRad) which had been equilibrated
with buffer D. A 32 ml wash with buffer D was followed
by a linear gradient from 0.02 M to 1.0 M NaPO4 in
buffer D. Fractions were collected and assayed by MmeI
endonuclease assay and polyacrylamide gel
electrophoresis on a 4-20% gradient gel, followed by
protein staining with Coomassie Brilliant Blue dye. MmeI
eluted between 0.26 M and 0.3 M NaPO4. A portion of
several fractions containing a single homogeneous
protein band of 105 kDa was used for protein
sequencing. The rest of the purified MmeI fractions were
pooled (6 ml at 0.36 mg/ml) and dialyzed against storage
buffer (10 mM Tris (pH 7.9), 50 mM KCl, 1 mM DTT, 0.1 mM
EDTA, 50% glycerol). The purified MmeI enzyme was stored
at -20°C.

Activity determination:

Samples of 1-4 µl were added to 50 µl substrate
solution consisting of 1X NEBuffer 1, 32 µM
S-adenosyl-L-methionine, and 1 µg DNA (lambda, PhiX174
or pUC19 DNAs). Reactions were incubated for 15 minutes
at 37°C, received 20 µl stop solution and were analyzed
by electrophoresis on a 1% agarose gel.

Optimized endonuclease activity 

Following purification of MmeI from M.
methylotrophus, experiments were performed to determine
the optimal reaction conditions for DNA cleavage.
Endonuclease activity was found to be significantly
enhanced by the presence of potassium in the reaction
buffer. Reactions were performed at 4°C to 37°C and from
5 to 60 minutes with no appreciable change in the amount
of DNA cleavage. Enzyme concentrations at or near
stoichiometric equivalence to DNA sites were required
for maximal cleavage. A large excess of enzyme blocked
cleavage. These findings were used to reassess the
activity of MmeI and to define a workable endonuclease
unit.

Unit definition 

One unit of MmeI is defined as the amount of MmeI
required to completely cleave 1 µg of PhiX174 DNA in 15
minutes at 37°C in NEBuffer 4 (20 mM Tris-acetate, 10 mM
magnesium acetate, 50 mM potassium acetate, 1 mM
dithiothreitol (pH 7.9 at 25°C)) supplemented with 80 µM
S-adenosyl-L-methionine (SAM).
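
Because maximal cleavage requires enzyme at or near stoichiometric equivalence to DNA sites, it can help to express both in picomoles. The following is a rough sketch under stated assumptions: an average mass of about 650 g/mol per base pair is an approximation, the 5386 bp length of PhiX174 is supplied by the caller, and the number of MmeI sites per molecule is left as a parameter because it is not given in this text. The function names are hypothetical.

```python
AVG_BP_MASS = 650.0  # approximate g/mol per base pair of double-stranded DNA


def pmol_dna(mass_ug: float, length_bp: int) -> float:
    """Picomoles of double-stranded DNA molecules in a given mass."""
    return mass_ug * 1e-6 / (length_bp * AVG_BP_MASS) * 1e12


def pmol_sites(mass_ug: float, length_bp: int, sites_per_molecule: int) -> float:
    """Picomoles of recognition sites; the site count is supplied by the caller."""
    return pmol_dna(mass_ug, length_bp) * sites_per_molecule


def pmol_enzyme(mass_ug: float, mol_weight_kda: float = 105.0) -> float:
    """Picomoles of enzyme, using the approximately 105 kDa size from the text."""
    return mass_ug * 1e-6 / (mol_weight_kda * 1000.0) * 1e12


if __name__ == "__main__":
    print(round(pmol_dna(1.0, 5386), 4))   # molecules of PhiX174 in 1 ug
    print(round(pmol_enzyme(0.105), 4))    # 0.105 ug of a 105 kDa protein
```

Multiplying the first value by the number of sites per molecule gives the amount of enzyme, in picomoles, that corresponds to one enzyme per site.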



EXAMPLE II

CLONING THE MmeI ENDONUCLEASE



1. DNA purification: Total genomic DNA of
Methylophilus methylotrophus was prepared. 5 grams of
cell paste was suspended in 20 ml of 25% sucrose, 0.05 M
Tris-HCl pH 8.0, to which was added 10 ml of 0.25 M
EDTA, pH 8.0. Then 6 ml of lysozyme solution (10 mg/ml
lysozyme in 0.25 M Tris-HCl, pH 8.0) was added and the
cell suspension was incubated at 4°C for 16 hours. 25
ml of Lytic mix (1% Triton-X100, 0.05 M Tris, 62 mM
EDTA, pH 8.0) and 5 ml of 10% SDS were then added and the
solution incubated at 37°C for 5 minutes. The solution
was extracted with one volume of equilibrated
phenol:chloroform:isoamyl alcohol (50:48:2, v/v/v) and
the aqueous phase was recovered and extracted with one
volume of chloroform:isoamyl alcohol (24:1, v/v) two
times. The aqueous solution was then dialysed against
four changes of 2 L of 10 mM Tris, 1 mM EDTA, pH 8.0.
The dialysed DNA solution was digested with RNase (100
µg/ml) at 37°C for 1 hour. The DNA was precipitated by
the addition of 1/10th volume 5 M NaCl and 0.55 volumes
of 2-propanol and spooled on a glass rod. The DNA was
briefly rinsed in 70% ethanol, briefly air dried and
dissolved in 20 ml TE (10 mM Tris, 1 mM EDTA, pH 8.0) to
a concentration of approximately 500 µg/ml and stored at
4°C.




2. The MmeI endonuclease was purified to
homogeneity as described in Example I above.

3. Amino acid sequences of the MmeI endonuclease
were obtained for the amino terminus and for several
internal cyanogen bromide digestion products of the MmeI
polypeptide. The MmeI restriction endonuclease, prepared
as described in Example I above, was subjected to
electrophoresis and electroblotted according to the
procedure of Matsudaira (Matsudaira, J. Biol. Chem.
262:10035-10038 (1987)), with modifications as
previously described (Looney, et al. Gene 80:193-208
(1989)). The membrane was stained with Coomassie blue
R-250 and the protein band of approximately 105 kD was
excised and subjected to sequential degradation on an
ABI Procise 494 Protein/Peptide Sequencer with gas-phase
delivery (Waite-Rees, et al. J. Bacteriol. 173:5207-5219
(1991)). The amino acid sequence of the first 14 amino
terminal residues obtained was the following:
ALSWNEIRRKAIEF (SEQ ID NO: 15).

An additional sample of the MmeI endonuclease, 20
µg in 20 µl, was treated with 2 µg of cyanogen bromide
(Sigma) dissolved in 200 µl of 88% distilled formic acid
for 24 hours in the dark at room temperature. This
reaction mixture was evaporated to dryness and
resuspended in 20 µl of loading buffer (1.5 M Tris-HCl,
pH 8.5, 12% glycerol, 4% SDS, 0.05% Serva Blue G, 0.05%
Phenol Red) at 100°C for 5 minutes. This sample was
subjected to electrophoresis on a Tris-Tricine 10 to 20%
polyacrylamide gradient gel (Invitrogen) for three hours
and then transferred to a polyvinylidene difluoride
(PVDF) membrane (Problott, Applied Biosystems Inc.)
using 10 mM CAPS buffer (10 mM
3-[cyclohexylamino]-1-propanesulfonic acid, 10% methanol,
0.05% SDS, 0.005% dithiothreitol, adjusted to pH 11.0
with NaOH) for 18 hours at 200 volts in a tank
electroblotter (TE52, Hoefer). The membrane was stained
with Coomassie blue



WO 2004/007670 



PCT/US2003/021570



R-250 and major bands of 25 kilodaltons (kD), 14 kD, 7.5
kD and 6 kD were observed, as well as smaller bands.
These stained protein bands were excised from the
membrane and each subjected to sequential degradation.
The fragments other than the amino terminal fragment are
derived from internal cleavage by cyanogen bromide at
methionine residues within the protein and thus
should be preceded by a methionine. The first 29
residues of the 25 kD peptide corresponded to
(M)KISDEFGNYFARIPLKSTXXIXEXNAliQ (SEQ ID NO: 16). Residues
20, 21, 23 and 25, labeled X, were not identified. The
first 40 amino acid residues obtained from the 14 kD
fragment were: (M)DAKKRENLGAHYTSEANILKLI
KPLLLDELWWFXKVKN (SEQ ID NO: 17). Residue 36 was not
determined. The first 25 residues of the 7.5 kD peptide
corresponded to (M)KSRGKDLDKAYDQALDYFSGIAER (SEQ ID
NO: 18). The 6 kD fragment was found to contain a mixture
of three sequences.

4. Amplification of a portion of the MmeI
endonuclease: The peptide sequence data from the amino
terminus, 25 kD, 14 kD and 7.5 kD peptides was used to
construct a series of degenerate PCR primers
corresponding to the codons for the amino acid residues.
The order of the internal peptide fragments was unknown,
so both forward (sense strand) and reverse (antisense
strand) primers were made for these fragments. The
primers were:
25 kD fragment: residues DEFGNYFA (SEQ ID NO: 19)
Forward:
1) 5'-GARTTYGGNAAYTAYTTYGC-3' (SEQ ID NO: 20)
Reverse:
2) 5'-AARTARTTNCCRAAYTCRTC-3' (SEQ ID NO: 21)

14 kD fragment: residues MDAKKR (SEQ ID NO: 22)
Forward A:
3) 5'-ATGGAYGCNAARAARCG-3' (SEQ ID NO: 23)
Forward B:
4) 5'-ATGGAYGCNAARAARAG-3' (SEQ ID NO: 24)
Reverse:
5) 5'-CGNCGYTTYTTNGCRTCCAT-3' (SEQ ID NO: 25)

7.5 kD fragment: residues DKAYDQA (SEQ ID NO: 26)
Forward:
6) 5'-GAYAARGCNTAYGAYCARGC-3' (SEQ ID NO: 27)
Reverse:
7) 5'-GCYTGRTCRTANGCYTTRTC-3' (SEQ ID NO: 28)

where
Y = T,C
R = A,G
H = A,T,C
S = G,C
N = A,C,G,T

Primers 1 and 2 are derived from the MmeI 25 kD
CNBr peptide and were prepared to prime on the sense
strand (1) or the antisense strand (2) of the gene.
Primers 3 through 5 are derived from the 14 kD CNBr
peptide and were prepared to prime on the sense strand
(3 and 4) or the antisense strand (5) of the gene, with
3 and 4 differing in the codon usage for the arginine
residue. Primers 6 and 7 are derived from the 7.5 kD
CNBr peptide and were prepared to prime on the sense
strand (6) or the antisense strand (7) of the gene.
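
The relationship between a forward and a reverse degenerate primer, and the number of distinct sequences each primer pool contains, can be checked with a small sketch. The function names are hypothetical. The sense-strand string passed to `reverse_complement` in the usage lines is an assumption for illustration: it is a degenerate back-translation of the residues DEFGNYF from the 25 kD peptide, truncated to 20 nucleotides.

```python
# IUPAC ambiguity codes (D is included only as the complement of H).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "AG", "Y": "CT",
         "S": "CG", "H": "ACT", "D": "AGT", "N": "ACGT"}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "R": "Y", "Y": "R",
              "S": "S", "H": "D", "D": "H", "N": "N"}


def reverse_complement(primer: str) -> str:
    """Reverse complement of a primer that may contain IUPAC ambiguity codes."""
    return "".join(COMPLEMENT[base] for base in reversed(primer))


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer:
        count *= len(IUPAC[base])
    return count


if __name__ == "__main__":
    sense = "GAYGARTTYGGNAAYTAYTT"            # assumed back-translation of D E F G N Y F
    print(reverse_complement(sense))          # compare with primer 2 in the list above
    print(degeneracy("GARTTYGGNAAYTAYTTYGC")) # primer 1
```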

PCR amplification reactions were performed using
the primer combinations of 1 with 5, 1 with 7, 3 with 2,
3 with 7, 4 with 2, 4 with 7, 6 with 2 and 6 with 7. A
portion of the MmeI gene was amplified in a PCR reaction
by combining:

80 µl 10X Thermopol buffer (NEB)
50 µl 4 mM dNTP solution (NEB)
4 µl MmeI genomic DNA (500 µg/ml stock)
16 µl 100 mM MgSO4
586 µl dH2O
16 µl (32 units) Vent® exo- DNA polymerase (NEB).

This master mix was divided into 8 aliquots of 90
µl, to which was added 5 µl forward primer (10 µM stock)
and 5 µl reverse primer (10 µM stock). The cycling
parameters were 95°C for 3 minutes for one cycle, then
95°C for 30 seconds, 46°C for 30 seconds, 72°C for 2
minutes, for 25 cycles.
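
As a rough check on the recipe above, the final concentration of a master-mix component in each 100 µl reaction can be computed from the listed volumes. The sketch below is illustrative only; the dictionary keys and function name are hypothetical, and the MgSO4 figure covers only the added MgSO4, not any magnesium already present in the buffer.

```python
# Volumes (in microliters) taken from the master mix listed above.
MASTER_MIX_UL = {
    "10X Thermopol buffer": 80.0,
    "4 mM dNTP": 50.0,
    "genomic DNA": 4.0,
    "100 mM MgSO4": 16.0,
    "dH2O": 586.0,
    "Vent exo- polymerase": 16.0,
}


def final_concentration(stock_conc: float, component_ul: float,
                        mix_total_ul: float, aliquot_ul: float,
                        reaction_ul: float) -> float:
    """Concentration of a master-mix component in the final reaction volume."""
    in_mix = stock_conc * component_ul / mix_total_ul   # concentration in master mix
    return in_mix * aliquot_ul / reaction_ul            # after adding primers


if __name__ == "__main__":
    total = sum(MASTER_MIX_UL.values())
    # 90 ul aliquot plus 5 ul of each primer gives a 100 ul reaction.
    print(total)
    print(round(final_concentration(4.0, 50.0, total, 90.0, 100.0), 3))    # dNTP, mM
    print(round(final_concentration(100.0, 16.0, total, 90.0, 100.0), 3))  # added MgSO4, mM
```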

The amplification reactions were electrophoresed on
a 1% agarose gel and analyzed. Major DNA amplification
products of 450 base pairs (bp) (primers 2 with 4), 650
bp (primers 5 with 6) and 1100 bp (primers 2 with 6)
were obtained. These fragment sizes are consistent with
the 7.5 kD CNBr fragment being located nearest the amino
end of the protein and approximately 650 bp away from
the 14 kD CNBr fragment, with the 14 kD fragment between
the 7.5 kD and the 25 kD fragment and adjacent to the 25
kD fragment. The amplified DNA fragments were gel
purified and sequenced using the primers that were used
for the amplification. A translation of the DNA sequence
obtained matched the amino acid sequence derived from
the purified MmeI endonuclease, indicating that a
portion of the MmeI endonuclease gene DNA sequence had
been successfully obtained.

5. Determining the DNA sequence for the entire
MmeI gene and adjacent DNA: The inverse PCR technique
was used to extend the DNA sequence from both sides of
the 1060 bp of the MmeI gene obtained above. To
accomplish this a series of primers matching the MmeI
gene DNA sequence and oriented for inverse PCR were
designed and synthesized. MmeI genomic DNA was cut with
a number of restriction endonucleases and ligated at low
concentration to generate circular DNA templates.

A. MmeI genomic DNA was digested with ten different
restriction endonucleases and then circularly ligated to
obtain DNA templates to amplify using the inverse PCR
technique. The restriction enzymes used were:

BspHI (T/CATGA)
EcoRI (G/AATTC)
HindIII (A/AGCTT)
HinP1I (G/CGC)
MspI (C/CGG)
NlaIII (CATG/)
PstI (CTGCA/G)
SacI (GAGCT/C)
SphI (GCATG/C)
XbaI (T/CTAGA)

Restriction enzyme digests were performed by
combining:

5 µl 10X NEBuffer recommended for the enzyme (varied
with enzyme)
2 µl M. methylotrophus genomic DNA (1 µg)
43 µl dH2O
1 µl (10-20 units) restriction enzyme.

The reactions were incubated for 1 hour at 37°C.
The restriction endonuclease was inactivated by heating
the reaction to 65°C (80°C for PstI) for 20 minutes. The
digested DNA was then ligated into circular fragments by
adding 50 µl 10X T4 DNA ligase buffer, 400 µl dH2O and 3
µl concentrated T4 DNA ligase (6000 units, New England
Biolabs, Inc.) and incubating at 16°C for 16 hours. The
ligated DNA was then extracted with phenol and
chloroform, precipitated with 2-propanol and resuspended
in 100 µl TE buffer.
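
The digest-and-circularize step can be pictured with a toy model of inverse PCR. This is a simplified sketch under stated assumptions, not the procedure itself: the function names and the toy genome are hypothetical, only the top strand is modeled, and primers are idealized as sitting exactly at the boundaries of the known region and pointing outward. The returned string is the unknown flanking sequence that such primers would amplify, with the downstream flank joined to the upstream flank at the ligation junction.

```python
def cut_positions(genome: str, site: str, cut_offset: int):
    """Top-strand cut coordinates for every occurrence of a restriction site."""
    positions = []
    i = genome.find(site)
    while i != -1:
        positions.append(i + cut_offset)
        i = genome.find(site, i + 1)
    return positions


def inverse_pcr_flanks(genome: str, site: str, cut_offset: int,
                       known_start: int, known_end: int) -> str:
    """Flanking sequence captured when the fragment around a known region is circularized."""
    cuts = cut_positions(genome, site, cut_offset)
    if any(known_start < c < known_end for c in cuts):
        raise ValueError("enzyme cuts inside the known region")
    left = max((c for c in cuts if c <= known_start), default=None)
    right = min((c for c in cuts if c >= known_end), default=None)
    if left is None or right is None:
        raise ValueError("known region is not flanked by sites on both sides")
    fragment = genome[left:right]
    ks, ke = known_start - left, known_end - left
    # After ligation the fragment end is joined to the fragment start, so the
    # downstream flank is followed directly by the upstream flank.
    return fragment[ke:] + fragment[:ks]


if __name__ == "__main__":
    known = "GGGGCCCCGGGG"   # toy stand-in for the already sequenced region
    genome = "TTTTGAATTCACACAC" + known + "TGTGTGGAATTCAAAA"
    start = genome.find(known)
    # EcoRI cuts G/AATTC, one base into the site.
    print(inverse_pcr_flanks(genome, "GAATTC", 1, start, start + len(known)))
```

Note that the recognition site is regenerated at the junction, which marks where the upstream and downstream flanks meet in the product.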

B. Amplification of DNA adjacent to the 1060 bp
fragment of the MmeI endonuclease gene: Two pairs of PCR
primers were designed, one near each end of the 1060 bp
sequence obtained from direct PCR with degenerate
primers. The primer sequences were:

primer IP 1:
5'-GTTGGATCCCGCACAGATTGCTCAGG-3' (SEQ ID NO: 29)

primer IP 2:
5'-GTTGGATCCTACGTTAATCTGAATAAGATG-3' (SEQ ID NO: 30)

primer IP 3:
5'-GTTGGATCCTGTTAATCTGAAACGCTGG-3' (SEQ ID NO: 31)

primer IP 4:
5'-GTTGGATCCTTATACCAAAATGTGAGGTC-3' (SEQ ID NO: 32)



Inverse PCR reactions were performed on the 10
circularized templates produced above with the primer
pairs of IP 1 with IP 2, IP 3 with IP 4, and IP 1 with
IP 3. The amplification reactions were assembled by
combining:

80 µl 10X Thermopol buffer (NEB)
50 µl 4 mM dNTP solution (NEB)
40 µl IP primer (forward)
40 µl IP primer (reverse)
16 µl 100 mM MgSO4
534 µl dH2O
16 µl (32 units) Vent® exo- DNA polymerase (NEB).

The master mix was aliquoted into ten tubes of 76
µl, to which was added 4 µl of the appropriate digested,
circularly ligated template. The cycling parameters were
95°C for 3 minutes for one cycle, then 95°C for 30
seconds, 56°C for 30 seconds, 72°C for 3 minutes, for 25
cycles. Amplification products were analyzed by agarose
gel electrophoresis.

For primers IP 1 and IP 2 with the SphI template
and the NlaIII template a product of approximately 825
bp was obtained. For primers IP 3 and IP 4 with the
BspHI template a product of approximately 800 bp was
obtained. For primers IP 1 and IP 3 with the EcoRI
template a product of approximately 1500 bp was
obtained. These amplified DNA fragments were gel
purified, sequenced and assembled with that previously
obtained. The assembled sequence did not contain the
entire MmeI endonuclease open reading frame. The
assembled sequence was used to direct synthesis of a
second group of inverse PCR primer pairs. The sequences
of these primers were:

primer IP 5:
5'-TTCAGAAATACGAGCGATGC-3' (SEQ ID NO: 33)

primer IP 6:
5'-GTCAAGCCATAAACACCATC-3' (SEQ ID NO: 34)

primer IP 7:
5'-GAGGGTCAGAAAGGAAGCTG-3' (SEQ ID NO: 35)

primer IP 8:
5'-GTCCAACTAACCCTTTATGG-3' (SEQ ID NO: 36)

Inverse PCR amplification reactions were performed
as above. Using primers IP 5 and IP 6, products were
obtained from the NlaIII template (approximately 450 bp)
and the MspI template (approximately 725 bp), but not
from the other circular ligation templates. Using
primers IP 7 and IP 8, products were obtained from the
EcoRI template (approximately 500 bp), the SphI template
(approximately 825 bp) and the BspHI template
(approximately 750 bp). These DNA fragments were
sequenced and the sequence was assembled with that
previously obtained. The assembled sequence did not yet
contain the entire MmeI endonuclease open reading frame,
so another round of primer synthesis and inverse PCR was
performed. Additional DNA templates were generated as
above, but using the restriction enzymes ApoI (R/AATTY),
AseI (AT/TAAT), BsaHI (GR/CGYC), MfeI (C/AATTG), SspI
(AAT/ATT) and EcoRV (GAT/ATC) to digest M.
methylotrophus genomic DNA. The sequences of this third
round of primers were:

primer IP 9:
5'-TTCCTAGTGCTGAACCTTTG-3' (SEQ ID NO: 37)

primer IP 10:
5'-GTTGCGTTACTTGAAATGAC-3' (SEQ ID NO: 38)

primer IP 11:
5'-CCAAAATGGAACTTGTTTCG-3' (SEQ ID NO: 39)

primer IP 12:
5'-GTGAGTGCGCCCTGAATTAG-3' (SEQ ID NO: 40)



Inverse PCR amplification reactions were performed
as above. Using primers IP 9 and IP 10, products were
obtained from the NlaIII template (approximately 425
bp), the MfeI template (approximately 750 bp), the ApoI
template (approximately 800 bp) and the MspI template
(approximately 2100 bp). Using primers IP 11 and IP 12,
products were obtained from the SphI template
(approximately 875 bp), the BspHI template
(approximately 925 bp) and the EcoRI template
(approximately 950 bp). These DNA fragments were
sequenced and the sequence was assembled with the
sequences previously obtained. Further sequencing was
performed on the IP 9, IP 10 MspI 2100 bp product using
three additional primers:

primer S1:
5'-GCTTCATTTCATCCTCTGTGC-3' (SEQ ID NO: 41)

primer S2:
5'-TAACCGCCAAAATTAATCGTG-3' (SEQ ID NO: 42)

primer S3:
5'-CCACTATTCATTACAACACC-3' (SEQ ID NO: 43)

The final assembled sequence (Figure 2) contained
the entire MmeI restriction gene, as well as 1640 bp of
sequence preceding the gene and 1610 bp of sequence
following the gene.

6. Cloning the MmeI endonuclease gene in E.
coli: The putative MmeI endonuclease open reading frame
was identified from the DNA sequence assembly obtained
from sequencing the various inverse PCR amplified DNA
fragments. The beginning of the open reading frame was
identified on the basis of the match of the predicted
amino acid sequence at the amino terminus of the open
reading frame with the sequence determined from the MmeI
endonuclease protein. The predicted end of the open
reading frame would allow for the coding of an
approximately 105 kD polypeptide, which matched the
observed size of the native MmeI endonuclease. The amino
acid sequence deduced from translation of this open
reading frame contained conserved sequence motifs of
N6mA DNA methyltransferases. However, no open reading
frame containing sequence motifs conserved among DNA
methyltransferases was observed adjacent to the MmeI
endonuclease gene, as had been expected. It was decided
to try to express the MmeI endonuclease in E. coli
without having a second methyltransferase present to
protect the E. coli host DNA from cleavage.
Oligonucleotide primers were synthesized to specifically
amplify the MmeI gene from M. methylotrophus genomic DNA
for expression in the cloning vector pRRS (Skoglund,
Gene 88:1-5 (1990)). The forward primer contained a PstI
site for cloning, a stop codon in frame with the lacZ
gene of the vector, a consensus E. coli ribosome binding
site, the ATG start codon for translation (changed from
the GTG used by M. methylotrophus to facilitate greater
expression in E. coli) and 20 nucleotides that matched
the M. methylotrophus DNA sequence:

5'-GTTCTGCAGTTAAGGATAAGATATGGCTTTAAGCTGGAACGAG-3'
(SEQ ID NO: 44)
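
The design of the forward primer can be sanity-checked by locating its PstI site and translating from the ATG start codon, then comparing the result with the amino terminal sequence determined from the protein (ALSWNEIRRKAIEF). The sketch below is illustrative; the function names are hypothetical, and the codon table is the standard genetic code built from the conventional TCAG ordering.

```python
# Standard genetic code in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

FORWARD_PRIMER = "GTTCTGCAGTTAAGGATAAGATATGGCTTTAAGCTGGAACGAG"  # SEQ ID NO: 44
PSTI_SITE = "CTGCAG"


def translate(dna: str) -> str:
    """Translate complete codons of a DNA string, ignoring any trailing bases."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def encoded_peptide(primer: str, start_codon: str = "ATG") -> str:
    """Peptide encoded by a primer from its first start codon onward."""
    start = primer.find(start_codon)
    if start < 0:
        raise ValueError("no start codon found")
    return translate(primer[start:])


if __name__ == "__main__":
    print(PSTI_SITE in FORWARD_PRIMER)
    print(encoded_peptide(FORWARD_PRIMER))
```

After the initiating methionine, the translated residues can be compared directly with the start of the amino terminal sequence reported for the purified enzyme.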

The reverse primer contained a BamHI site for
cloning and 22 nucleotides that matched the M.
methylotrophus DNA sequence 3' to the end of the MmeI
open reading frame:

5'-GTTGGATCCGTCGACATTAATTAATT^ 3'
(SEQ ID NO: 45)

The MmeI gene was amplified in a PCR reaction by
combining:

50 µl 10X Thermopol buffer (NEB)
30 µl 4 mM dNTP solution
12.5 µl forward primer (10 µM stock)
12.5 µl reverse primer (10 µM stock)
5 µl MmeI genomic DNA (500 µg/ml stock)
387 µl dH2O
3 µl (6 units) Vent® DNA polymerase

The reaction was mixed and aliquoted into 5 tubes
of 80 µl each. MgSO4 was added (100 mM stock) to bring
the final concentration of Mg++ ions to 2 mM, 3 mM, 4 mM,
5 mM and 6 mM respectively. The cycling parameters were
95°C for 30 seconds, 60°C for 30 seconds, 72°C for 3
minutes, for 24 cycles. The reactions were analyzed by
gel electrophoresis and the 3 mM through 6 mM Mg++
reactions were found to contain a DNA band of the
desired size of 2.8 kb. These reactions were pooled and
the 2.8 kb band was gel purified. The 2.8 kb amplified
MmeI gene fragment was digested with BamHI and PstI
endonucleases (NEB) in the following reaction
conditions:

15 µl 10X BamHI reaction buffer (NEB)
1.5 µl BSA (NEB)
50 µl MmeI gene 2.8 kb amplified DNA fragment
80 µl dH2O
5 µl BamHI endonuclease (100 units)
5 µl PstI endonuclease (100 units)

The reaction was mixed and incubated for 1 hour at
37°C. The small fragments cleaved off the ends of the
2.8 kb DNA fragment were removed, along with the
endonucleases, by purification on a Qiagen QiaPrep spin
column according to the manufacturer's instructions.

The cleaved MmeI gene DNA fragment was ligated to
the pRRS vector as follows: 10 µl of the digested,
purified 2.8 kb MmeI fragment was combined with 5 µl pRRS
vector previously cleaved with BamHI and PstI and
purified, 5 µl dH2O, 20 µl 2X QuickLigase Buffer (NEB),
the reaction was mixed, and 2 µl of QuickLigase was
added. The reaction was incubated at room temperature
for 5 minutes. 5 µl of the ligation reaction was
transformed into 50 µl chemically competent E. coli ER2683
cells and the cells were plated on L-broth plates
containing 100 µg/ml ampicillin and incubated at 37°C
overnight. Approximately 200 transformants were obtained
and 18 representatives were analyzed as follows: plasmid
from each colony was isolated by miniprep procedures and
digested with AlwNI and NdeI endonucleases to determine
if they contained the correct size insert. 2 of the 18
transformants had the correct size insert of
approximately 2800 bp. Both clones were tested to see if
they produced MmeI endonuclease activity. The clones
were grown overnight at 37°C in 500 mL L-broth
containing 100 µg/ml ampicillin. The cells were
harvested by centrifugation, suspended in 10 mL
sonication buffer (20 mM Tris-HCl, 1 mM DTT, 0.1 mM EDTA,
pH 7.5) and broken by sonication. The crude lysate was
cleared by centrifugation and the supernatant was
recovered. The lysate was assayed for endonuclease
activity by serial dilution of the lysate in 1X reaction
buffer NEBuffer 1 (New England Biolabs) containing 20
µg/ml lambda DNA substrate and supplemented with SAM at
100 µM final concentration. The reactions were incubated
for 1 hour at 37°C. The reaction products were analyzed
by agarose gel electrophoresis on a 1% agarose gel in 1X
TBE buffer. One of the two clones had MmeI endonuclease
activity. This active clone was designated strain
NEB1457 and was used for subsequent production of MmeI.
The plasmid construct expressing MmeI activity in this
clone was designated pTBMmeI.1.



EXAMPLE III

THE MmeI ENDONUCLEASE PROVIDES IN VIVO PROTECTION
AGAINST MmeI CLEAVAGE

The plasmid pTBMmeI.1 was purified from NEB1457
using the Qiagen miniprep protocol. This plasmid has two
MmeI sites in the vector backbone, and one site within
the MmeI gene. The plasmid was digested with MmeI to
test whether this DNA was resistant to MmeI endonuclease
activity, which would indicate that the single MmeI gene
was able to methylate DNA in vivo to protect the host
DNA against its endonuclease activity. To test this the
following were combined:

10 µl pTBMmeI.1 miniprep DNA
15 µl 10X NEBuffer 4
15 µl SAM (1 mM stock solution)
110 µl dH2O
1 µl MmeI endonuclease (15 units)

The reaction was mixed and split in thirds. To one
third was added 0.5 µl dH2O, to the second was added 0.5
µl pRRS vector and to the third was added 0.5 µl PhiX174
DNA as a positive control. The pTBMmeI.1 was not cleaved
by the MmeI endonuclease activity, while the PhiX174 and
pRRS DNAs in the same reaction were cleaved, indicating
that the three MmeI sites in the pTBMmeI.1 DNA are
resistant to MmeI endonuclease activity (Figure 4).

EXAMPLE IV

MmeI ENDONUCLEASE SENSITIVITY TO METHYLATION

The prior literature reports that MmeI endonuclease
methylates just one strand of its recognition sequence,
and that this hemi-methylation does not block subsequent
cleavage of the DNA by the endonuclease (Tucholski, Gene
223 (1998) 293-302). To test this a set of four
oligonucleotides was synthesized so that a DNA
substrate could be formed that was either unmethylated
(oligo 1 + oligo 2), methylated in the top strand only
(oligo 3 + oligo 2), methylated in the bottom strand
only (oligo 1 + oligo 4), or methylated on both strands
(oligo 3 + oligo 4). The oligos synthesized were:


Oligo 1: 

5'-FAM-GTTTGAAGACTCCGACGCGATGGCCAGCGATCGGCGCCTCAGCTTTTG-3' 
(SEQ ID NO:46) 

Oligo 2: 

5'-FAM-CAAAAGCTGAGGCGCCGATCGCTGGCCATCGCGTCGGAGTCTTCAAAC-3' 
(SEQ ID NO:47) 

Oligo 3: 

5'-FAM-GTTTGAAGACTCCG(6mA)CGCGATGGCCAGCGATCGGCGCCTCAGCTTTTG-3' 
(SEQ ID NO:48) 

Oligo 4: 

5'-FAM-CAAAAGCTGAGGCGCCGATCGCTGGCCATCGCGTCGG(6mA)GTCTTCAAAC-3' 
(SEQ ID NO:49) 

(Other nucleotides outside the MmeI recognition 
sequence were also methylated for other studies, but 
since MmeI does not have any sequence specificity for 
these nucleotides this does not affect MmeI activity, and 
these other methylations are omitted here for clarity.) 
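
The design of the substrate can be illustrated with a brief consistency 
check. The sketch below, in Python, is offered only as an illustration: 
it confirms that oligo 2 is the reverse complement of oligo 1, locates 
the 5'-TCC(Pu)AC-3' site, and derives the expected cut positions from the 
cleavage offsets stated in the Background (20 nucleotides 3' of the site 
on the recognition strand, 18 nucleotides on the complement strand). The 
function names are hypothetical, the FAM labels and methyl groups are not 
modeled, and positions are counted from the 5' end of the top strand. 

```python
import re

# Top and bottom strands as listed above (oligo 1 and oligo 2, labels omitted).
OLIGO_1 = "GTTTGAAGACTCCGACGCGATGGCCAGCGATCGGCGCCTCAGCTTTTG"
OLIGO_2 = "CAAAAGCTGAGGCGCCGATCGCTGGCCATCGCGTCGGAGTCTTCAAAC"

# 5'-TCC(Pu)AC-3', where Pu is A or G.
SITE = re.compile("TCC[AG]AC")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def mmei_cuts(top):
    """List (site_start, top_cut, bottom_cut) for each site found in `top`.

    All values are 0-based offsets from the 5' end of `top`; a cut at
    offset k falls between residues k and k + 1 (1-based).
    """
    cuts = []
    for match in SITE.finditer(top):
        top_cut = match.end() + 20     # 20 nt 3' of the site, same strand
        bottom_cut = match.end() + 18  # 18 nt from the site, complement strand
        if top_cut <= len(top):
            cuts.append((match.start(), top_cut, bottom_cut))
    return cuts


for start, top_cut, bottom_cut in mmei_cuts(OLIGO_1):
    overhang = top_cut - bottom_cut
    print(f"site at {start + 1}, top cut after {top_cut}, "
          f"bottom cut after {bottom_cut}, 3' extension of {overhang}")
```

With these sequences the only site lies on the top strand, and the two 
cut positions differ by two nucleotides, in keeping with the 2 base 3' 
extension described for MmeI. 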
Duplex DNA was formed by mixing 100 µl top strand oligo 
(14 µM stock) with 100 µl bottom strand oligo (14 µM 
stock), heating to 85°C and cooling slowly to 30°C over 
20 minutes. MmeI was then used to cleave the oligo pairs 
in a 30 µl reaction of 1X NEBuffer 4, 2.5 µM oligo, 
100 µM SAM and 2.5 units MmeI. As a control, 
restriction endonuclease Hpy188I was also used to cleave 
the oligo DNA. The Hpy188I recognition sequence, 
5'-TCNGA-3', overlaps the first 5 nucleotides of the MmeI 
recognition sequence in this DNA and is blocked by 
methylation at the adenine in either strand of the DNA. 
MmeI was found to cleave unmethylated DNA as expected. In 
contrast to previous teaching (Tucholski, Gene 
223:293-302 (1998)), MmeI did not cleave the 
hemi-methylated DNA when the top strand only was 
methylated: 5'-TCCG(N6mA)C-3'. When the bottom strand 
only was methylated, MmeI did cleave the DNA. When both 
strands were methylated, MmeI did not cleave the DNA 
(Figure 5). This finding is consistent with both the 
observed ability of the single MmeI enzyme to protect 
host DNA against cleavage in vivo and the observation 
that MmeI methylates only the top strand of its 
recognition sequence. We confirmed the report that MmeI 
enzyme methylates only the top strand of its recognition 
sequence by methylating the oligo pairs above with 
tritium-labeled SAM (3H-SAM), washing away the 
unincorporated SAM and counting the radioactivity in 
the DNA. Both the unmethylated oligo DNA and the top 
unmethylated, bottom methylated DNA had greater than 
10-fold more counts than background, while the bottom 
unmethylated, top methylated DNA and the DNA with both 
strands methylated had counts near background (Figure 
6). These findings indicate that MmeI is a novel type of 
restriction modification system which does not require a 
separate methyltransferase enzyme to modify the host DNA 
to provide protection against the activity of the 
endonuclease, as is required for the type IIG (also 
called type IV) enzymes such as Eco57I. 
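
The incorporation data of Figure 6 can be summarized as fold values. The 
sketch below, in Python, is illustrative only. The background count is 
not stated in the text, so the duplex methylated on both strands is used 
here as a stand-in reference; that choice is an assumption, as are the 
names used in the code. 

```python
# 3H counts from Figure 6, keyed by (top strand state, bottom strand state).
COUNTS = {
    ("unmethylated", "unmethylated"): 19972,
    ("unmethylated", "methylated"): 14447,
    ("methylated", "unmethylated"): 1266,
    ("methylated", "methylated"): 917,
}

# Assumption: the fully methylated duplex stands in for background,
# because the actual background count is not given.
REFERENCE = COUNTS[("methylated", "methylated")]


def fold_over_reference(counts, reference):
    """Express each substrate's counts as a multiple of the reference."""
    return {key: value / reference for key, value in counts.items()}


folds = fold_over_reference(COUNTS, REFERENCE)
for (top, bottom), fold in folds.items():
    print(f"top {top:<12} bottom {bottom:<12} {fold:5.1f}-fold")
```

On this basis the two substrates with an unmethylated top strand give 
more than 10-fold the reference counts, while the substrate methylated 
only in the top strand stays within 2-fold of it, which is the pattern 
expected if label is incorporated only into the top strand. 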

EXAMPLE V 

DNA SEQUENCING AND ANALYSIS 



DNA Sequencing: DNA sequencing was performed on 
double-stranded templates on an ABI 373 or ABI 377 
automated sequencer. Amplified DNA fragments and 
individual clones were sequenced with primers 
synthesized as above or with universal primers located 
in the vector. 

Computer analyses: Computer analyses of the DNA 
sequences obtained were performed with the Genetics 
Computer Group programs (Devereux, et al., Nucleic Acids 
Res. 12:387-395 (1984)) and database similarity searches 
were performed via the internet at the National Center 
for Biotechnology Information site 
(http://www.ncbi.nlm.nih.gov/BLAST/) using the BLASTX 
and the BLASTP algorithms (Altschul, et al., J. Mol. 
Biol. 215:403-410 (1990) and Gish, et al., Nature Genet. 
3:266-272 (1993)). 
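
One simple form of such a computer analysis is a translation check of 
the DNA sequence against the amino acid sequence. The sketch below, in 
Python, translates the 60 nucleotides beginning with the GTG at position 
1641 of Figure 2 and compares the result with the first 20 residues of 
Figure 3. Reading that GTG as the initiation codon (and therefore as 
methionine) is an inference drawn from comparing the two figures, not a 
statement made in the text. The function names are illustrative, and the 
standard genetic code is assumed. 

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

# Nucleotides 1641-1700 of Figure 2 (assumed start of the MmeI reading frame).
ORF_START = (
    "GTGGCTTTAA" "GCTGGAACGA" "GATAAGAAGA"
    "AAAGCTATTG" "AGTTTTCTAA" "AAGATGGGAA"
)

# First 20 residues of Figure 3.
FIGURE_3_START = "MALSWNEIRRKAIEFSKRWE"


def translate(dna, initiator_as_met=True):
    """Translate DNA in frame 1; trailing partial codons are ignored."""
    dna = dna.upper().replace(" ", "")
    usable = len(dna) - len(dna) % 3
    codons = [dna[i:i + 3] for i in range(0, usable, 3)]
    protein = [CODON_TABLE[codon] for codon in codons]
    # Bacterial alternative start codons are read as methionine at position 1.
    if initiator_as_met and codons and codons[0] in {"ATG", "GTG", "TTG"}:
        protein[0] = "M"
    return "".join(protein)


print(translate(ORF_START))
print(translate(ORF_START) == FIGURE_3_START)
```

The same function can be applied to longer stretches of Figure 2 to 
confirm that the reading frame continues in step with Figure 3. 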

WHAT IS CLAIMED IS: 

1. Isolated DNA coding for the MmeI restriction 
enzyme, wherein the isolated DNA is obtainable from 
Methylophilus methylotrophus. 

2. A recombinant DNA vector comprising a vector 
into which a DNA segment coding for MmeI has been 
inserted. 

3. Isolated DNA coding for the MmeI endonuclease 
and MmeI methyltransferase, wherein the isolated DNA is 
obtainable from ATCC Accession No. PTA-4521. 

4. Vectors that comprise the isolated DNA of 
claim 3. 

5. A host cell transformed by the vector of claim 
2 or 4. 

6. A method of producing recombinant MmeI 
restriction endonuclease and MmeI methylase comprising 
culturing a host cell transformed with the vector of 
claim 2 or 4 under conditions suitable for expression 
of said endonuclease and methylase. 

7. Isolated DNA coding for an MmeI-like restriction 
enzyme, wherein said isolated DNA hybridizes to at least 
one conserved motif of the nucleotide sequence coding 
for the MmeI restriction enzyme under predetermined 
conditions. 

FIGURE 2: MmeI DNA SEQUENCE 

1 GAATTCCAGA TAGGTAGTCC TTTGGTACTT CCATCCCAAC CAGTGTCACG 

51 TTCCGCGCCA AACCAATCGG TTAAAGTGTA AGAAAGTCTT GCACTGAAGT 

101 AGCTGTAGGA CAAACCGAAG TTAACCTCTG TGGTATCCGA GCGACCACCT 

151 TTAGGTGTTT GACGGAAGCC TGCTGCGTCA CCTGCCAAGT TATATTTCTT 

201 CCATGAACCA CCTGGGTACA GGTAGCTGAT CAAACCAGCA GTCCAACCCA 

251 AGCCTTCAAT AGCAGGAATA GTTCCGTTAT ACCCACCATA AATATCAATT 

301 TCGGCAGTTG CATCAGGGAA GGTATTTGGT GTCACGTTTG AACCCCATGC 

351 ACCGACATAA AAGCCGCTGT CATGAGTAAT ATCAATACCG CCTTGAACGG 

401 CAGGTTTGTG CCAGTTTTGT GAAATACCAC GAGCATAGTA ATCTGAAACA 

451 AATCCAACGT TTGCAGTAGC AGCCCAGGCT GATTTTTCTT CTTTAGCCTC 

501 TTCAGCTGCG TATGAAACTT GGGCAAAAGA TAATGTGCTT AACACTGCTG 

551 1*GAGCAATAT AGATTGACGC ATTAIGAG'fC CTCICTC'DGT GAAATCTTTG 

601 ATTAAGTTGT TGTAAACGAG AATGAAACAA CAACCACAAA GCAAAGCACG 

651 TGCCAAACTA TAAATAACAT TATAATCAAT TATTTAAAAT ATATTTATAA 

701 TCTAAAATAT TAAATTAATT ATTTAATAAA CTGTTTTTTA TTGATTTAAC 

751 TCTAAAACAT ATGGGTGCAA CCACCCTTTT TACTCACTGA TAATGCTAAN 

801 ATAGCCAACA AAGGAGCCTT CACCATGCTG ATTTCAAATG AAAAAATTCA 

851 GGAATTATCT TTAAAAATCA AACAACTAAT CGAATCAAGC CCCATTTCAG 

901 AGCTAAATAA CAACTTGCAT GCACTAATTC AGGGCGCACT CACCAAAATG 

951 GAACTTGTTT CGCGTGAAGA ATTCGATATC CAATCTGCAT TATTAGCGCG 

1001 CACGCAAGAG CAATTAAAAC GTCTTGAAGA AAAAATCAGC CAGCTTGAAG 

1051 AAGGGCAGGC ATCCAGAAAG TAAAAATTAA TTTACAATTG TTAGCATTCC 

1101 ATTATTGAGG AGTGCGCTAT GAGTCTGGCG GTGTTATACA GTCGCGCGTT 

1151 AAGCGGCATG GAGGCGCCAG AAGTGGTGGT AGAAGTCCAC TTGGCGAATG 
1201 GACTACCCAG CTTTACCATT GTTGAAACAT ATTGAAACTT TAAGCCTTAG 
1251 CATTTTTTCA AATATACAAA TGCCCCAAGC TGGTGCATTA AGAAGAATGT 
1301 AACAACTCCC TGCAGACTAG GAATAACTTC ATGATTTAAC GAACATCCCT 
1351 GAGTTTCAAA GTCGAATCTT CTCGTGTTGC AAATTTCTAC AGCTTCCTTT 
1401 CTGACCCTCT TGCACCAAAT TGCACTATGG CGCTAATAAA TCTTCTGCTA 
1451 TCCAATAATG TCCAACTAAC CCTTTATGGA CTCTTAAAAA AGATTTAATA 
1501 AATGATTAAG ATGAATTCAA GGAATTTGAT GCCTGGAAAT ATGGCAAAAG 
1551 CAAAAAGGCA GCCCAGTGCT GACTTTTTTG TTTTAACATT GGCCCATATA 
1601 TCCAATTTCA AATAATTTAA AAATTATCGG GAGCTAATCT GTGGCTTTAA 
1651 GCTGGAACGA GATAAGAAGA AAAGCTATTG AGTTTTCTAA AAGATGGGAA 
1701 GACGCCTCAG ATGAAAACAG TCAAGCCAAA CCCTTTTTAA TAGATTTTTT 
1751 CGAAGTTTTT GGAATAACTA ATAAGAGAGT TGCAACATTT GAGCATGCTG 
1801 TCAAAAAGTT CGCCAAGGCC CATAAGGAAC AATCTCGAGG ATTCGTAGAT 
1851 TTGTTTTGGC CTGGCATTCT TCTTATTGAA ATGAAAAGCA GAGGTAAAGA 
1901 CCTCGACAAA GCGTATGACC AGGCACTTGA TTACTTTTCT GGCATTGCAG 
1951 AAAGAGACTT ACCCAGATAC GTTTTAGTTT GCGACTTCCA GCGTTTCAGA 
2001 TTAACAGACC TAATAACAAA AGAGTCAGTT GAATTTCTTT TAAAGGACTT 
2051 ATACCAAAAT GTGAGGTCTT TTGGTTTTAT AGCTGGTTAT CAAACTCAAG 
2101 TAATCAAGCC ACAAGACCCT ATTAATATTA AGGCGGCTGA ACGGATGGGT 
2151 AAGCTTCATG ACACCCTGAA GTTGGTTGGA TATGAGGGAC ACGCTTTAGA 
2201 ACTTTATCTA GTGCGTTTAC TTTTTTGCTT ATTCGCAGAA GACACAACTA 
2251 TTTTTGAGAA AAGTTTATTC CAAGAATATA TCGAGACAAA GACGCTAGAG 
2301 GACGGCAGTG ACCTTGCACA TCATATCAAT ACACTTTTTT ATGTTCTCAA 
2351 TACCCCAGAA CAAAAAAGAT TAAAGAATCT AGACGAACAC CTTGCTGCAT 
2401 TTCCATATAT CAATGGAAAA CTTTTCGAGG AGCCACTTCC GCCAGCTCAG 
2451 TTTGATAAAG CAATGAGAGA GGCATTGCTT GACTTGTGCT CATTAGATTG 
2501 GAGCAGGATT TCACCAGCAA TATTTGGAAG TTTATTCCAA AGCATTATGG 
2551 ATGCTAAAAA GAGAAGAAAT CTTGGGGCAC ACTACACCAG CGAAGCAAAT 
2601 ATTCTCAAGT TAATCAAGCC ATTGTTTCTT GACGAGCTCT GGGTAGAGTT 
2651 CGAGAAAGTT AAAAATAATA AAAATAAATT ACTAGCGTTC CACAAAAAAC 
2701 TAAGAGGACT TACATTTTTC GACCCTGCAT GCGGTTGCGG AAATTTTCTT 
2751 GTAATCACAT ACCGAGAACT AAGACTTTTA GAAATTGAAG TGTTAAGAGG 
2801 ATTGCATAGA GGTGGTCAAC AAGTTTTGGA TATTGAGCAT CTTATTCAGA 
2851 TTAACGTAGA CCAGTTTTTT GGTATCGAAA TAGAGGAGTT TCCCGCACAG 
2901 ATTGCTCAGG TTGCTCTCTG GCTTACAGAC CACCAAATGA ATATGAAAAT 
2951 TTCAGATGAG TTTGGAAACT ACTTTGCCCG TATCCCACTA AAATCTACTC 
3001 CTCACATTTT GAATGCTAAT GCTTTACAGA TTGATTGGAA CGATGTTTTA 
3051 GAGGCTAAAA AATGTTGCTT CATATTAGGA AATCCTCCAT TTGTTGGTAA 
3101 AAGTAAACAA ACACCGGGAC AAAAAGCGGA TTTACTATCT GTTTTTGGAA 
3151 ATCTTAAATC CGCTTCAGAC TTAGACCTAG TTGCTGCTTG GTATCCCAAA 
3201 GCAGCACATT ACATTCAAAC AAATGCAAAC ATACGCTGTG CATTTGTCTC 
3251 AACGAATAGT ATTACTCAAG GTGAGCAAGT ATCGTTGCTT TGGCCGCTTC 
3301 TGCTCTCATT AGGCATAAAA ATAAACTTTG CTCACAGAAC TTTCAGCTGG 
3351 ACAAATGAGG CGTCAGGAGT AGCGGCGGTT CACTGCGTAA TTATCGGATT 
3401 TGGGTTGAAG GATTCAGATG AAAAAATAAT CTATGAGTAT GAAAGTATTA 
3451 ATGGAGAACC ATTAGCTATT AAGGCAAAAA ATATTAATCC ATATTTGAGA 
3501 GACGGGGTGG ATGTGATTGC CTGCAAGCGT CAGCAGCCAA TCTCAAAATT 
3551 ACCAAGCATG CGTTATGGCA ACAAACCAAC AGATGATGGA AATTTCCTAT 
3601 TTACTGACGA AGAAAAAAAC CAATTTATTA CAAATGAGCC ATCTTCCGAA 
3651 AAATACTTCA GACGGTTTGT GGGCGGGGAT GAGTTCATAA ACAATACAAG 

3701 TCGATGGTGT TTATGGCTTG ACGGTGCTGA CATTTCAGAA ATACGAGCGA 

3751 TGCCTTTGGT CTTGGCTAGG ATAAAAAAAG TCCAAGAATT CAGATTAAAA 

3801 AGCTCGGCCA AACCAACTCG ACAAAGTGCT TCGACACCAA TGAAGTTCTT 

3851 TTATATATCT CAGCCGGATA CGGACTATCT GTTGATACCT GAAACATCAT 

3901 CTGAAAACAG ACAATTTATT CCAATTGGTT TTGTTGATAG AAATGTCATT 

3951 TCAAGTAACG CAACGTATCA TATTCCTAGT GCTGAACCTT TGATATTTGG 

4001 CCTGCTTTCA TCGACCATGC ACAACTGCTG GATGAGAAAT GTAGGAGGAA 

4051 GGTTAGAAAG TCGTTATAGA TATTCTGCCA GCCTGGTTTA CAACACGTTT 

4101 CCATGGATTC AACCCAACGA AAAACAATCG AAAGCGATAG AAGAAGCTGC 

4151 ATTTGCGATT TTAAAAGCTA GAAGCAATTA TCCAAACGAA AGTTTAGCTG 

4201 GTTTATACGA CCCAAAAACA ATGCCTAGTG AGCTTCTTAA AGCACATCAA 

4251 AAACTTGATA AGGCTGTGGA TTCTGTCTAT GGATTTAAAG GACCAAACAC 

4301 AGAAATTGCT CGAATAGCTT TTTTGTTTGA AACATACCAA AAGATGACTT 

4351 CACTCTTACC ACCAGAAAAA GAAATTAAGA AATCTAAGGG CAAAAATTAA 

4401 TTAATGTATT TAACATTAAA CCACCCTGAT TTATTTCGAA TAGTTCAAAT 

4451 GCTTCCATGT GGACTAATCG CCTTCAATCA TATTAAAAAA CCGACGCTAG 

4501 TAATAAAAAC TTCCAAAGAG GCCATATTAA CCGCCAAAAT TAATCGTGAA 

4551 TTTAAAATAT ATCTTTATCA AACCACATCG GCTTGTGTTC TAGTAAGTGC 

4601 ATTTTTTGAC GATTCTGATA GTCCACTATT CATTACAACA CCAATTGTTC 

4651 GAGATGACCA ACACTCCTTA GACTTGTTAA GATTTTTAAT CAACAATGAT 

4701 TTTACGATTT GCTTCTTTGA TGAACTGAAC CGAGAATTTC TTTCCGTTAA 

4751 CGCAACTGGT AATTTAGTCT CTATCTTTGA GAGCATTCAC TTGATGCCAC 

4801 TGCCGAGCCC AGAGGAAGCC CACAATGCAT TGAATGAAGC GGAATTTTGG 

4851 TTCAGTTTAC GCTCAGCTGC TGATGATGAA TCATCTATCC AGGTTTCTTT 

4901 ATTGGATAAT CTATTTCCTG ACGATTTTGT AATTTATGAC CTATCCTCAA 

4951 ACAAAAACGA TATGACATCA TTGGTTAGAG AAACTAAACC AGGATACTAT 

5001 CAGGAAGCAG ATATTGCAAA GTTACTAACA AGAGCTTTTA GTTTGGAAAG 

5051 CATTTATCAG AATCCAGTGA AAACAAGCGA TTCAAAAGAG TTGGCAGACG 

5101 TTGTGGTATT CGGCCAAAAG GAAATTTTAA TAATTCAAGC TAAAGATAGT 

5151 GAAAACAATC AGAAACAAGT TTTAGAGGTT TCGTTAGACA AGAAATGCGC 

5201 AAAGTCTTCA AAGAAACTTT CTGAAGCTTT GGCACAACTC ACCGACACTA 

5251 TCTTAACAAT ATCCAATACA CCAATAGTTG ATGTTCGGGT TGGTAAGAAA 

5301 AAATGCACTC TGAACTTTGA GGGAAAGCAG CTTATTGGTA TCGTCGTTGT 

5351 TAAAGAGCTT TTTAATGATA TTTACGATAA ATACAGTCAA AAAGTTTTTG 

5401 AGCATGTAGA GTTGTCTAAA GCACCCATTG TCTTCTTTGA CTATCCAGAA 

5451 TTTGCAAGAA TGACATTTCA TTGTAATTCT GAGGAATTAT TACTTTATGC 

5501 TTTGCATAGG ATATTTAGTT CTGCAATAGA AAATGGAATG TATAAACGAT 

5551 TGAGATTTAC TCAACCTATC ATAACTGATG GTCATCACAG CTACTTCAGG 

5601 ATAGAAAACA GGCCCCATTC TGATGAGGCC TATTTAATTT GCACAGAGGA 

5651 TGAAATGAAG CTCTCAAATA AGTTTAAAGA CTAAATTTAT ATTTTCCTCA 

5701 GTATCTTAAA AACAATATTC ATTAAATTGG AAAGCCCGCA ATGATTGTTG 

5751 CAGTATCAAT GCGGGCATCA GTATCCAGCT CTTGCAATAC ACGGAAGTAT 

5801 CAAGAAGCGA ATCAGGATTC TAACCATACC TTTTTAATTG CAACAATCTA 

5851 ATTTCCATAA CATGTGTAGC TACATCGAAA AAAAGACCTC GAAGAGGTTG 

5901 CAAGAGCGTC CAGCTCGCGG CATCAAAAGA CCCTAGTCTT TTGACAAGGG 

5951 GGAGCCAAAA AACTGAGGTG GAGGAGCTTG CCGACGAAGC CAGGAAGCCC 

6001 CAGCGTCCGG 

FIGURE 3: MmeI AMINO ACID SEQUENCE 

1 MALSWNEIRR KAIEFSKRWE DASDENSQAK PFLIDFFEVF GITNKRVATF 

51 EHAVKKFAKA HKEQSRGFVD LFWPGILLIE MKSRGKDLDK AYDQALDYFS 

101 GIAERDLPRY VLVCDFQRFR LTDLITKESV EFLLKDLYQN VRSFGFIAGY 

151 QTQVIKPQDP INIKAAERMG KLHDTLKLVG YEGHALELYL VRLLFCLFAE 

201 DTTIFEKSLF QEYIETKTLE DGSDLAHHIN TLFYVLNTPE QKRLKNLDEH 

251 LAAFPYINGK LFEEPLPPAQ FDKAMREALL DLCSLDWSRI SPAIFGSLFQ 

301 SIMDAKKRRN LGAHYTSEAN ILKLIKPLFL DELWVEFEKV KNNKNKLLAF 

351 HKKLRGLTFF DPACGCGNFL VITYRELRLL EIEVLRGLHR GGQQVLDIEH 

401 LIQINVDQFF GIEIEEFPAQ IAQVALWLTD HQMNMKISDE FGNYFARIPL 

451 KSTPHILNAN ALQIDWNDVL EAKKCCFILG NPPFVGKSKQ TPGQKADLLS 

501 VFGNLKSASD LDLVAAWYPK AAHYIQTNAN IRCAFVSTNS ITQGEQVSLL 

551 WPLLLSLGIK INFAHRTFSW TNEASGVAAV HCVIIGFGLK DSDEKIIYEY 

601 ESINGEPLAI KAKNINPYLR DGVDVIACKR QQPISKLPSM RYGNKPTDDG 

651 NFLFTDEEKN QFITNEPSSE KYFRRFVGGD EFINNTSRWC LWLDGADISE 

701 IRAMPLVLAR IKKVQEFRLK SSAKPTRQSA STPMKFFYIS QPDTDYLLIP 

751 ETSSENRQFI PIGFVDRNVI SSNATYHIPS AEPLIFGLLS STMHNCWMRN 

801 VGGRLESRYR YSASLVYNTF PWIQPNEKQS KAIEEAAFAI LKARSNYPNE 

851 SLAGLYDPKT MPSELLKAHQ KLDKAVDSVY GFKGPNTEIA RIAFLFETYQ 

901 KMTSLLPPEK EIKKSKGKN* 

FIGURE 4: pTBMmeI.1 IS RESISTANT TO MmeI CLEAVAGE 

FIGURE 5: MmeI CLEAVAGE OF HEMI-METHYLATED SUBSTRATES 

[Gel image with numbered lanes; not reproduced in this text.] 


WO 2004/007670 



PCT/US2003/021570 



10/17 



FIGURE 6: METHYLATION INCORPORATION BY MmeI ENDONUCLEASE 

Top strand:       Bottom strand:     3H-COUNTS 
5'-TCCGAC-3'      5'-GTCGGA-3' 

unmethylated      unmethylated       19,972 
unmethylated      methylated         14,447 
methylated        unmethylated        1,266 
methylated        methylated            917 

'A' indicates position of N6-methyl adenine in the DNA 
substrate 

FIGURE 7: Multiple Sequence Alignment of MmeI and Homologous 
Polypeptides 

PileUp of: @mme.list2 
Symbol comparison table: GenRunData:blosum62.cmp CompCheck: 1102 

GapWeight: 6 
GapLengthWeight: 1 

Name: mmelfeLORF3P = gi|28373198|ref|NP_783835.1| (SEQ ID NO:3) 
Name: mmeLrel21P = gi|23451826|gb|AAN32874.1|AF461726_1 (SEQ ID NO:4) 
Name: mme = MmeI amino acid sequence (SEQ ID NO:5) 
Name: mmeNMA1791 = gi|15794682|ref|NP_284504.1| (SEQ ID NO:6) 
Name: mmeBSU0677 = gi|16077744|ref|NP_388558.1| (SEQ ID NO:7) 
Name: mmegcry = gi|9945797|gb|AAG03371.1| (SEQ ID NO:8) 
Name: mmePflQ8 = gi|23451826|gb|AAN32874.1|AF461726_1 (SEQ ID NO:9) 
Name: saro3834 = gi|23110638|gb|ZP_00096791.1| (SEQ ID NO:10) 
Name: mmeMSI135 = gi|20803963|emb|CAD31540.1| (SEQ ID NO:11) 
Name: mmeCC0826 = gi|16125079|ref|NP_419643.1| (SEQ ID NO:12) 
Name: mmeDR0119.1 = gi|15807788|ref|NP_285443.1| (SEQ ID NO:13) 
Name: mmeDR2267 = gi|15807258|ref|NP_295988.1| (SEQ ID NO:14) 



mmelfeLORF3P 
mmeLrel21P 
mme 

mmeNMAl791 
mmeBSU0677 
mmegcry 
mmePtlQ8 
saro3834 
mmeMSI135 
mmeCC0826 
mmeDR0119.1 
mmeDR22 67 



mmel f eLORF3 P 
mmeLrel21P 
mme 

mmeNMA1791 
mmeBSU0677 
mmegcry 
mmePf 1Q8 
saro3834 
mmeMSI135 
mmeCC0826 
mmeDR0119.1 
mmeDR2267 



50 

-MPT RQQAAREFVK TWS . SDKKGR EDADRQTFWN 



MALSWNE IRRKAIEFSK RWE . DASD . . ENSQAKPFLI 

MKTLLQ LQTAAQNFAA YYK.DQTD. . ERREKDTF * N 

— MALID LEDKIAEIVN R.E.DHSD FLY 

MVMAPTTVFD RATIRHNI/TE FKLRWLDRIK QWEAENRPAT ESSHDQQFWG 



-MSLGAAGL TPITPAAFIK KWRKSELG . . ERQAAQEHFL 

MTPAQFVK KWSDSQLR . . ERQASQEHFL 

— MHPQEFAD TWSRRALKAT ERDSYVQHWL 

MPQTE TAQRMEDFVA YW..RTLKGD EKGESQV.FL 



51 100 
DLLQRVYGID N.YYDYITYE KDVQVKADGK VTTRRIDGYI P . STKIMVEM 

DFFE . VFGIT N.KR. . .VAT FEHAVKKFAK AHKEQSRGFV DL . . .FWPGI 
EFFA.IFGID R.KN. . . VAH FEYPVKD. .P ADNTQ. . . FV DI . . . FWEGI 

ELLG.VYDVP R.AT...ITR LKK.GN QNLTKRVGEV HLKNKVW. . . 

DLLDC.FGV. N.ARDLYLY QRSAK RASTGRTGKI DM...FMPGK 

, M 

D.ICSLVGHP SP.SDEDPTG AFFAFEKGAN KLG.GGKGFA D.VWK. .KGH 
D.LCRMLEVP TP . AEDDPLG ERYCFERGAA KTG . GGDGWA D.VWR..KGC 

D . LCQLLHHE APGADPD YKFERRVT KVGTKDKGFA D . VFK . . KAH 

DRLFQAFGH. . . . AGYKEAG AE. .LEYRVA KQG . GGKKFA DLLWR. . PRV 



mmelfeLORF3P 
mmeLrel21P 
mme 

mmeNMA1791 
mmcsBSU0677 



101 



KGKNIKDLSK PITQSGGD ELT PFEQAKRYAN FLPN. 



150 
. SEQ 



LLIEMKSRGK DL D KAY D . . QALDYFS GIAERD...L 

FLAEHKSANK NL T KAK E . . QAERYLQ EIGRTKPSAL 

. FKEAK . KGK" LF . ..WD.. . ALI DIEQQVEYL . . SAK 
mmegcry VIGEAKSLGV PLDDA YAQALDYLL G.GTIANSHM 

mmePf 1Q8 , , r _ 

saro3834 NPVEIEEAVS DLARAPYDAS EFPFQFLAAF GNKQTTLQRL RAGNSNQSDL 

mmeMSI135 FAWEYKRKKG NLDEA LLQLMRYAP AL 

. mmeCC0826 FGWEYKGKHK NLDAA LRQLQAYAL DL 

mmeDR0119.1 FITEYKRPGS DLGAA . LQQATLYSR DL 

mmeDR2267 LI.EMKKRGE KLANH . . YQQAFDYWL KL 



mmelfeLORF3P 
mmeLrel21P 
mme 

mmeNMA1791 
mmeBSU0677 
mmegcry 
ramePflQS 
saro3834, 
mmeMSI135 
mmeCC0826 
mmeDR0119 . 1 
mmeDR2267 



151 
PR. . 



WILVSNFNEI DIHDM . . E . . . RPLDEPKVT KL . 



PR. 
PE. 
PR. 
PA. 



YVLVCDFQRF RLTDLITK . . . ESVE . . . . F LL . 
YYAVSDFAHF HLYRRVPE . . . EGAENQWQF PL . 

YLDVTDYDGV LAKDTKTL . . . EALDVKF 

YWCSNFETL RVTRLNRTYV GDSADWDITF PL. 



200 



PGAVLQRNHI HIATCDAGNV DRTLAALRKS PKTASQKARF I LATDGVAFQ 

. . . .L. .SPP LHIVCDI ERL RIHTAWTNTV PSTY. .VITL DDLAE 

. ...Q..NPP YXiWSDMERI IVHTNWTNTI SRKI..EFTL DDLHE 

. . . .G. .NPP LLLTSDFQRI EINTAFTGTS PKSY. . LITL DDIAENRWG 
VPDRPR YAVLCNFDEL WVYDF NQQ L DEPMDRLRI . 



mmelfeLORF3P 
mmeLrel21P 
mme 

mmeNMA1791 
mmeBSU0677 
mmegcry 
mmePf 1Q8 
saro3834 
mmeMS113 5 
mmeCC0826 
mmeDR0119 . 1 
mmeDR2267 



201 250 
. EDLPKKVKS L E F MVDA NQQQVIDEKQ LSVDAGNLVA 

.KDLYQNV.R S FGFIA. . .GY QTQVIKPQDP INI KAAERMG 

. EELPEYITR G V FDFMF...GI EAKVRQIQEE ANI QAAATIG 

. EELPQY FDFFLAWKGI EKVEFEKENP AD I KAAERF A 

. AEIDEHIEQ L A F LADY ETSAYREEEK ASLEASRLMV 

AEDMASGETV ACNYAAFPDK FAFFLPLAGI TTVQQIRESS FDIKATGRLN 

PSAREM LHNVFFSPEK L RPTR..TRAA VTKEAADKF S 

. . . . PEKLAM LRQVFDGSDS L KPKI . . SPQE LTAKVAQRFG 

GNDVP . ALQI LHSALHQPYD L DPRL . . FRER ITTDATRQVG 

. EELPERYTV LNFMFEQ..E R APLFGNNRVD VTREAADSVA 



mmelfeLORF3P 
mmeLre!21P 
mme 

mmeNMA1791 
mmeBSU0677 
mmegcry 
mmePf 1Q8 
saro3834 
mmeMSI13 5 
mmeCC0826 
mmeDR0119 . 1 
mmeDR2267 



251 

KIYNELTNAY AAGRGIDVN. 



300 

. EPRIQRS LN. .MLIVRL VFLLYADDSN 



KLHDTL.K LVG YE. GHA LE..LYLVRL LFCLFAEDTT 

RLHDAL . K EEG . ... IYE . EHE LR . . LFITRL LFLFFADDSA 

RIYDVLRK ENN . ... IIETNRG LD . . LFLIRL LFCFFAEDTD 

ELFRAMNGDD VDEAVGDDAP TTPEEEDERV MRTSIYLTRI LFLLFGDDAG 

KLYVELLKDN PDWA SRS EDMNHFMARL. IFCFFAEDTD 

AIALRVQGR . G . TPD EIAHFVNQL VFCFFAQSVS 

DLGRRLQER. GHHPR DVAHFLNRV VFCMFAEDAK 

LVARRLGERE GRT RAAHMMMRV VFALFAEDTG 

KVLNSVIAR. GEDRA RAQRFLLQC VMAMFAEDFE 



mmelfeLORF3P 
mmeLrel2lP 
mme 

mmeNMA1791 
mmeBSU0677 
mmegcry 
ivinePf IQ8 



301 350 

LFGKEDI FQA FIER...REP RDIRRDLSEL FKVLDQP . EE QRDPYLDDEF 

IFEKS.LFQE YIETKTLEDG SDLAHH INTL FYVLNTP.EQ KRLKNLDEHL 

VFRRNYLFQD FLE..NCKEA DTLGDKLNQL FEFLNTP . DQ KRSKTQSEKF 

IFKRNS.FTN LIKTLTEEDG SNLNKLFADL FIVL DK NERDDVPSYL 

LWDTPHLFAD FVRNETTPE . . SLGPQLNEL FSVLNTA.PE KRPKRLPSTL 
saro3834 
mmeMSI135 
mmeCC0826 
mmeDRO 119.1 
mmeDR2267 



mmelfeLORF3P 
mmeLrel2lP 
mme 

mmeNMA1791 
mmeBSU0677 
mmegcry 
mmePflQ8 
saro3834 
mmeMSI135 
iraneCC0826 
mmeDRO 11 9.1 
mmeDR2267 

mmelfeLORF3P 
mmeLrel2lP 
mme 

mmeNMA1791 
mmeBSU0677 
mmegcry 
mmePf 1Q8 
saro3834 
mmeMSI135 
mmeCC0826 
mmeDR0119.1 
mmeDR2267 



mmelfeLORF3P 
mmeLrel21P 
mme 

mmeNMA1791 
mmeBSU0677 
mmegcry 
mmePflQ8 
saro3834 
mmeMSI135 
mmeCC0826 
mmeDR0119.1 
mmeDR2267 



mmelfeLORF3P 
mmeLrel2lP 
mme 

mmeNMA1791 
mmeBSU0677 
mmegcry 
mmePf 1Q8 
saro3834 
mmeMSI135 
inmeCC082b 



IFVGEGLFSR TVETMSARDA SDTHMVIAEI FRAMDTRLAD RAAAGIKSWA 

LLPD.GLFTK LLK.RSARAP ERAMSYLDKL FEAME RGGEF. . . DL 

LLPE.GLFTR LTRSMQMRPP AEAAPQFDAL FAMMR AGGMF. . .GA 

MLER.GIVTR LLE . RARAPP GEDQLYFQDL FGAMK GGGEF . . . WG 

LIPR.GFFTE L ADD . ARAGR GSSFDLFGGL FRQMNTS ERA RGGRF. ! . . . 

351 . 400 

NQFAYVNGGM FSDENVIIPQ FTDELKRLIV EDAGRGFDWS GISPTIFGAV 

AAFPYINGKL FEEPLPPA . Q FDKAMREALL .DLCS.LDWS RISPAIFGSL 

KGFEYVNGGL FKERLRTF.D FTAKQHRALI .DCGN.FDWR NISPEIFGTL 

KEFPYVNGQL FTEPHTEL . E FSAKSRKLII . ECGELLNWA KINPDIFGSM 

AKFPYVNGAL FAEPLAS . EY FDYQMREALL AAC..DFDWS TIDVSVFGSL 

DVFPYVNGQL FSGS.TECPR FSKIARSYLL H..IGSLDWQ KINPDIFGSM 

TDXTWFNGGL FDGR. .RALR LDDGDIGLL . .VAADSLDWG LIDPTIFGTL 

DIVHWFNGGL FDEK. .PALP LERADIKLIH DTAAEH . DWS DLDPSVFGNM 

TDIRHFNGGL FDSE. .DALA LTSEDAAAL. . IIAAKLDWS EVEPSIFGTL 

APIPYFNGGL FRAV. .DPIE LNRDELYLLH KAALEN.NWA RIQPQIFGVL 

401 450 

FESTLN. PET RRSGGMHYTS IENIHKVIDP LFLNDLHDEF D 

FQSIMD. AKK RRNLGAHYTS EANILKLIKP LFLDELWVEF E 

FQSVMD.AQE RREAGAHYTE AANIDKVING LFLENLRAEF E 

IQAVAS.EES RSYLGMHYTS VPNIMKVTKP LFLDKLNQSF 
FQLVKS.KEA RRSDGEHYTS KANIMKTIGP LFLDELRAEA D. ! . ] 



IQAVAD . DEE RGALGMHYTS VPNILKVLNP LFLDDLRAKL E 

FERFLD . PEK RAQIGAHYTD PEKIMRLVDP VILRPLRQEW EQARREIVEL 
FEEALKATRE RAALGAHYTD REKILKIIDP VITWPLMAQW ETALAEIRAA 

FENSLDV.DT RSRRGAHYTS VNDIERIVDR WMEPLWAEW D 

FQSSMDKKEQ HAK . GAHYTS EADIMRWLP TIVTPFQRQI EAATTQ . . . ! 

451 500 
KIQNMG 

' KVK. . . 

" AVK. .A 

L 

KL. .VS 



LNGN RKPPMRR . . . QQSRR MKREEAA 

LDARAAAEAE RKAVLEAAAE AMRADPVKAK AGEAARRKTL TAIAKRSDAA 
ALRLSL PELKK 



NRRQRVTRAK AFRDKLGKLK FFDPACGSGN FLTETYLSLR KMENECLRII 

. . -NNKWKLL AFHKKLRGLT FFDPACGCGN FLVITYRELR LLEIEVL . RG 

LKRDKAKKLA AFYQKIQNLQ FLDPACGCGN FLIVAYDRIR ALEDD I IAEA 

DAYDDYTKLE NLLTRIGKIK FFDPACGSGN FLIITYKELR RMEINI IKRL 

SPSTSVAALE RFRDSLSELV FADMACGSGN FLLLAYRELR RIETDIIVAI 

. AGDNSRKLL NLRNRMAKIR VFDPACGSGN FLVIAYKQMR ELEAEI 

.AEVR.SR.. . FTERLRKLR ILDPACGSGN FLYLALQGVK DIEHRANLDC 

LG^AK.DRLE APLSku^FR VLDPACGSGW f jj'iVALhALi; DIERRALVDA 
mmelfeLORF3P 
mmeLrel21P 
mme 

mmeNMA1791 
mmeBSU0677 
mmegcry 
mmePf 1Q8 
saro3834 
mmeMSI135 
mmeCC0826 
mmeDR0119.1 
mmeDR2267 



mmelfeLORF3P 
mmeLrel21P 
mme 

mmeNMA1791 
mmeBSU0677 
mmegcry 
mmePf 1Q8 
saro3834 
mmeMSI135 
mmeCC0826 
mmeDR0119.1 
mmeDR2267 



mmelf eLORF3 P 
mmeLrel2lP 
mme 

mmeNMA1791 
mmeBSU0677 
mmegcry 
mmePflQ8 
saro3834 
mmeMSI13 5 
mmeCC0826 
mmeDR0119 . 1 
mmeDR2267 



mmelfeL0RF3P 
mmeLrel2lP 
mme 

mmeNMA1791 
mmeBSU0677 
mmegcry 
mmePf 1Q8 
saro3834 
mmeMSI135 
mmeCC0826 
mmeDR0119 . 1 
mmeDR2267 



751 

VLFNDFHIFI NYGYRSFEWN NEAANKAKVD WIVGFSTK. 

LLLS . LGIKI NFAHRTFSWT NEASGVAAVH CVIIGFGLKD 
SLLN. QGIEI HFAHRTFQWT SQAAGKAAVH CIIVGFRQKP 
EIrFK . FGIQI NFAYKSFKWA NNAKNNAAVI WIVGFG. 
PIFKA.GWRI RFAHRTFAWD SEAPGKAAVH CVIVGFDKES 



800 

. EDKNPTIYD 
-VNKDKILYN 
. . SDEKIIYE 
PMPS EKTLYD 
PLDTKVNKYL 
. . QPRPRLWD 



RILKSAN . .V KFAYRPFRWS NSAANNAGVY CTIIGLTGSE VSNKK. 
RIIA.ES.RL FEAWSDEPWV VDG. . . AAVR VSLICFGHG. .EDPLCL*. 
PIAD.AG.AL MEAWADEPWA LEG. • .AAVR VSMFGFGDG. . FAERRL . 
RIKA.TG.DL FMAWPDEPWQ QNG. . .AAVR VSLFGFDNG. . TETIiRT 
YWQHGG.TI TDAVGTQVWS GD AAVH VSIVNWVKGP AEGPKHLAWQ 



801 

EQKIIS 

SSN*ISH. . . 
YES INGEPLA 
YPDIKGEPEK 
FVD . . . ETKK 
YPDVKGEPVS 



. . A . KHINQY 
. . C . KNINGY 
IKA.KNINPY 
HAV . ANINPY 
L.V.SNISPY 
VEVGQSINAY 



MYDSDNIFID 
LFDGNNIFV. 
LRDGVDVIA . 
LIDAPDLII . 
LTDGENILV. 
LVDGPNVLVD 



• - • LFGEGSV VEC . SSIAPY 

. .DGRT . .VAQINAD 

. . EGRK AEHLHSD 

LNDGH VGVINAD 

VGDHRTS PWQ STELPVINSA 